c 1# iff at * <o sb m i

Tm? >i> - zmfcz ft % t ^ s mmwfr $ & % mm^?* v * 

(a) SEQ ID NO. 2 Ic 7* 2 ft %> T S. S MMF! : 

(b) SEQ ID NO. 2 IC^Z tl % T ~ S mm&l<DttiL : £.gite<DT = S m.&WT & 

■d r . mniL^mwii. . seq id no. lxtmc^^nsii^^owipjiiCv 

X h V > ->* x > HSi^ffTTA^ru ^VXt«i8»?(C<toT3 - K ft <£ ft T V * C 

(c) SEQ ID NO. 2T*f>?n5 7 5/ iE^lOt/l/hD^07;yiS^JT*fe 
oT\I£:*;Uhny}± N SEQ ID NO. 1 X« 3 tlSiK^f fD*t|S]i(C, 10 

x h >; ^v'x > hs^ftTT-A^ry ^^xts^ss^f let 

(d) SEQ ID NO. 2lC^^ni75/KS^]077^^>hT'goT 

at bb n o 

[Claim 2]

( a ) SEQ id No.2tc^^n?>z^yS!Be?>j; 

(b) SEQ ID NO. 2fC^;*n&75y&!ffi?iJ©tti£^g{*<D7'5V&?Be?iJT-& 

o T , mtt iL^m&te . SEQ ID NO. I X I* 3 lc 7jk 2 ft Z & ? <D ft fyilklc . 20 

(c) SEQ ID NO. 2 IC 7* % ft Z T S. S <D * >l< h U if <D T S. S m&M T* & 
oT, i*;Hoyi4» SEQ ID NO. lX«3fC5*Sft«;RBE#' ; f©WlRj*ite, 

x h 'J v 5> x y hft^ftTf a><7ij ^VXt5S«»f icio tn - Kft$nt^5 c 

(d) SEQ ID NO. 2t*?n575;Si?J077y^yh , P*?T. R77 

[Claim 3]
[Claim 4]

T SB ;P - 7 38 « * n § ? * * K E ?U 5> fig £> * Ml £ & » ? . 

(a) SEQ ID NO. 2 « n5 7 5 ^ KEW* 3- FfttS? *U*f KEM 

(b) SEQ ID NO. 2 fcTjiZ ft%T 5. S Wtm&l<Dtt±L%Z.mW<DT S. S MfaPl* a 

- K ft * U*^ K ffi ^*oT, KBB?!Mi> SEQ ID NO. IX 
it 3 (Cf>S^l5^i^?©Wlp)®lC7 h V y is x > F4ittTfA-i'7 l J ^ -Y X" t 5 C i: 
5:ltfSi:t57 * l^*?- F E ?U : 

(c) SEQ ID NO. 2(C/T £tl 5 7 = / iE?J©4-^ h D ^©7 5 7SE?J« 3 

- h* {t t 5 7 7 b t f K EM T o t , ^ 5* ^ U * ^ F BB ¥\ It . SEQ ID NO. IX 40 

It 3 ic tK fE ft % & Wl ft =? <0 ft ft m fc x h u y x v l<4^ftTfA/f7'J ^^Xt5 c i: 

(d) SEQ ID NO. 2 K: ,^ £ n 3 7 = y E M © 7 7 > h ^ 3 - F ft ?" 3 3* 

ntf-HE^JT'Sot. K77^^yHi, '>;&< l 0(Dl88t5757i^Sty 

( e ) ( a ) ~ ( d ) CO 5? * U :* F E *J K IB lit! Itt T* 35 S 5* * U # F BS 9\ „ 

[Claim 5]

(a) SEQ ID NO. 2{C^^4a^Z5yS5E?U%3 — F ft "f & 7 ^ 1^ 4" F E F'J 

; 50 
(b) SEQ ID NO. 2C^?n5 7 5/ iE5iJ(DWiSI«:07$/gE?iJ*3

— Y \tT % Z t V * * Y m^T* 2b i T s Z \s * ?■ V & . SEQ ID NO. IX 

it 3 icTfit* t\ on famic* v v y *j x y hfc&^TT/wru ^^xts c t 

(c) SEQ ID NO. 2 Cg2tl575y8i5l|©*;l' h D 7 $ / iEJlJ^ 3 

- K it? 5 ? Z V * b ' E 5>J "I* ib o T , K ? * U * ^ h E W& , SEQ ID NO. IX 

tt3fcij*Sti*«llii' : f©ttf«J«K:;* h 'J > 5> x > hft^ftTT'W^U ^^XI5: i: 

a ? * u * few ; 

(d) SEQ ID NO. 2Ci?n575yiE^J077i'*^yh%3 - Fifc-rs*^ 
Utf KEMTftoT, IS^^^VMi, '> & < t> 1 0 ©Bit 5 7 $ ^ 1^ t O il 10 

( e ) ( a ) ~ ( d ) <D 3* ^ U * ^ F E W K: *S ffi W T* fc £> ? * U * ^ F E ?U o 
[Claim 6]

[Claim 7]

[Claim 8]
[Claim 9]

[Claim 10]

ai*jaiE«©^n^©^y^F*«ji-rs^te-e*^T, (a) ~ (d) oflhfror 
[Claim 11]

m&m2mffi<Dftftfr<D^7?-b"%m&-tZ>75m-Vibr)-C. (a) ~ (d) (Dntlfr<D7 
5yKEW*3-hMt"r*5»^U*^-FE5>J*lS±IBIBl»cSIAL, Ftf^^U**- 

[Claim 12]

^y^;l/*tcteltSig*3S2l5«E©Mn*>©^^^FO#ffi^^Hit*573jiT'fe-pT, 30 

© # a * «i ta -r * 7? a o 

[Claim 13]

^>7 p ;U4>^IS^^^^^^Ud r ?^U^^F^*S^-r^^if5A>^W^-ri.^?£ 

o 

[Claim 14]

m Z •£ , KK*««K'<^ , ^FO«fi6Xtt»tt*««iL/'c*^3*»**iJ^-r*3!rffi. 40 
[Claim 15]

M # JK l 4 IB « <D # j* ^ T , SdlEHj|lttWE^^^F*«3S-r*«3S^^^-«r$tf 
[Claim 16]

[Claim 17]

tr IS #J ffl /a 1% o ' so 
[Claim 18]

n h mm- ixmmmz > /< * h k «t o $« s nzmmxu&ft&mm? s # £ t- & o t , 

[Claim 19]

E-fr-5ffflflSi:Hai£*:»»£-&. KHIS K © ft Si * ^ 313 L tc if -p £ i»J 3£ 

[Claim 20]

SEQ ID NO. 2tC^2n375VS?IB^Jt^&<i;&7 0%©ffi|B]'l4;£f#O7 T ~ 

y^B2?iJ^ ; &-r?.¥lli;h^^J-'K^^^^^^Ko 10 
[Claim 21]

89**2 OSSttO^^^-KtCfet^T, SEQ ID NO. 2 l£ 7* £ *l £> 7 5 / & BE ?"J i: 
[Claim 22]

tM5#J-«9B»3R'<"/^K*3-hMbLTV^*#««M» : fTa&^T, SEQ ID 
NO. 1 JUt 3 left Z ti & < i: & 8 0%©ffl|3]te*WLTt/*-5«»#?o 

[Claim 23]

is** 2 2 le-ii©^^^^ tcfc^T> seq id no. i xmic^jnsii^f 
imwvmM&mwi 20 

[0001]

& ffi ft ® 

* 58 W « , ot-t FD*v'7--(f ->h^DAP 4 5 0 i9J-ftiiit77r 5iJ-(c 
lSLf;ii-tti?y/^g> ffi2*$>*.DNA#?, Stf*>/<*«©«iftfcM'rSo 

[0002]

g gj s at » > /< g g 

( "dme- ) <Dm mi*. 4*a»KMt«-i8«4t»i|s«s)S8T$ 30 

[0003]

5feifOS6»Jtefev>T» *»J-ttWB*JR#, if ft fc* # © an ra , if n js © A , £ 8« # f* fc 

H«F-r*o^*i*^-r«. c core®. & *y © r # 3 a , mmmffiva t c n <b comm t <o 

ffi 5 ft m * ft ffi ft 5 C i: co S S -14 * IS ffi L T ^ 3 „ 08 *. » 2£S"J-ttitl»fgCYP2D 
6. ">h^aAp4 50 ("CYP") X — — 77?U - © * > M — , © £ 1£ ti % t/C 

B IK > ta IB # #5 8 , js & Br X * Si>*ta^SM*^#ty^ J£ IS B fc> fc 3 P§ #J © 3 X « iS 
-,tiitt«ofti»K8tUt§o c©«fc3*ai»miW©gftfiJ£tt* f£ *J co ft* %} . x « ± # 

's © s « , & a* $ 14 k m ^ oj ftg e # & & . 40 

[0004]

^ ffi si m * m ft l t o> § ^ 11 * # « . is it is k v t x if©#*#aisij 

> aUW©ftW«B&±©I*Jl5«:, i»OWM»?Etii;Tft!£2nT*fttf, 51 fiE > CC7 
^n-f(i±HT, fch©*I*S©{£ffl, X«££8yftMi»f!S©*n-:y©ffiffltCflXoTTC 

ts»7, cne.o^^oi@Bij^®T<o#^©issij{cov>Tcoaim*^^e>n5o c n £ © y 

— yl/fcffiv, X8iJ^ffi*&l©5£14#K Rtf^«W«:fiR0fT^«r, «#H;:fc:McjS#-f-I>ltiHc 
fltKttft-tS. $SS i: LT, ft I» £ g # £ n 3 1$ 14 © 31 & U* ft jg ft t± , Ml » 7 

T?*ftv»»tt©ra)a, &t/iwa , r*8f«ffl*iHi5B'r*cfc^i»#*o z zic, \ -Dcomm 50 
[0005]

St 3D © m ffl R ffl m US IC it , ->h*nA P 4 50 ("CYP") X-/<-7TSU-, N - 

7-bf ;^7>X7i7--l? ( "NAT" ) . UDP-^;Hd;->;^7V77i7- 
-tf ( "UGT" ) , ^f;l/h7VX7i7-t*, T Jl =1 - ll> -r t K a y - -if ( " A L D 

H " ) , Ha tf 'J ; $/'yf k Knyt--t? ("DPD") , NADPH :^yy* + 

> K U y * * - Hf ( "NOO" , X « " D T 7 * 7 - -tf " ) , * f - 3 - )\< O - =f- )V h 
7>X7i7--b' ( "COMT" ) , y;Hf^>S-h7>X7i7--(f ( "GST" 
) , kX^^>7<^;l/h7yX7x-7--e' ( "HMT" ) , X^*F 7 >X7i7-t" ( 

"st" ) . f^yyy^/i/h^yx?!?-*" ( " t p m t " ) stfx** k k k 10 
D^y7-ftftlti5. B£ S'Jft SB f4 . ^©ftftflSfigtcjct), -0£&t|}c2o©ffl(c:fr$i$ 

n^o is i ©i«ti'giEioii(cMiftffl^si?Lv ffl i i <DMmitft£<DSWimt<D 
m^icmm^m^Rig-r 0 en 5 <d # a « * as © ay ft sh © $ *i§ # % m £ n t ^ 3 £ •? fc 

, SEfti! W 3U4 ?S £ & & cd mm? 3 T~ (4 # 0U * fcf , rS 14 * K « HI © ffi ffl tf. ft? 
«©:/a-trX©-g|5i: L ^^iT^5. 

[0006]

ffl I co s IS (c It % 7 5 7^ - -tf CD 7 5 y it s x X -r )\> R If T = K «D ftn 7k # ft , 0J * it\ y 
•J y>Stf U >&&£©*§ •g-KJE;, h * a A p 4 5 0^{k/jl7C»^->X-rAtc<fcS^ 
It, &tfl|gJW»lgK»cfctta#JW©«k5ft» »8ffffl7ntXA^Sn5. in 7* # 8? 14 , 

5 © IE - «r g '14 is 4 K a 5 - -tf & a* x X -r 7 - -tf' fc «fc 0 , ± fc 3T K , S Xf to Sg rt T? jg £ 20 

3 „ fTst-tftT^-^oitii, n fc ht m & xf to m * t m m it l , # m ffl y d 

-b X © 7C SB 73- ^ -3 T !/> 5 o SI 7C S J& (i , iC'hBSftOSISSrttec 5, 
[0007]

ffl I I © » * (4 , Ste»Hfc7.kigteBjHfc©te^tc»JStttffl*ftl£l,» iintci^Tgtt 
, «i£ti*©£ ft ¥WEJSte*W ffl I I © IP JR © 0!l tc tt , ^ti^ti, y ;l/ ^ ^ 
DP-^;l'7Dy>';l/h7yX7i7-t"A^jn5 0 t-^vxxi^-fa, ±icgffi. 

[0008]

Bf « (t , mnv?m®.mm*^ts%3 z<DMffl<Dffi£%fi 5 £&ftgfiiiLT*&-3T . ^tn^nm 
m * m it v xtt*s-&LTi^sffliRtfffli i©»jts©s5#*^*n*o 

[0009]

^ © ^ , i t$fc^?fiJ-^ffli»^^i:©J:^fc3iiJffii$nT^i.^fcJ:^t©T'^So $> 3 S? 

6 -c* « , jie^^^^fln^-pfeo, g a tt « , st © , ®£.-m<Dmm(D&mfticmGkL 

X ^ %> o 

[0010]

^ m r o ffii ^. i4\ ^ - e #6 'ft & m w m m * is 11 r •§ ^ (4 , * •? % © a « # l 

fc:»S^S^^ A ? > ■rt/^ili(S)^c$,^s#©•!7-^y;^-7°^^b?.o ftifiol£?S»aoHI5l 

(4, ««*St**fflAol#^«:nIfllK:-r*. cogs, S^^f^lW, »»*Slt/£»JRK 

[0011]

mm<DT u - $ , a^oaoRicBiLftfeRa? orastcu^TiBicfto 

T^5. C4-U4, H « l» C M S L m m tf. ft^fg^14^»©?S'l4{k. ft^ftffl^tffl 
•3t^5fti6TJB5„ Si*^Jfc{4, M*Jgfi!t©S!ii(4. fgS14^H^ei41t;-ri.fflI^^ 50 
0, ^ n tf £P # ft K , iH£ftfC&5££ftT*5D, ISiftHEfifXKCt^TX^'J-
y^tlSCi^T'tS. f§ I I 



^fctt<KLfctBlH?iit©«fcDi«</>S§?9tfs ^lO*Sf 
D N ARtfliDH, & ffi © 9 & Jfc fiSc (C ^fz 5 9J ft ffl % 

t -r a c 



— + > y > 



[0012]

HS<J-ttiHifiSt©g3f&S14tf> 

IttOl»t*4e, t hO**EMlLT^4. 10 
[0013]
Cytochrome P450

ffilOia-ftMlOMtlt, ->h^ai^p 4 5 0 ("CYP") 7-/<-77S>J 

- © * ^ ^ - # m <>f n , c © * > /< - k & „ w tc *3 ^ x ft si s n a ± s n m m ix m m 

Y 7 3 OOMSLftSi^tl^^ C tl 6 tt 7 5 / 

»*BI^'ttK«fcoTg;a:S:7r5»J-k:#SSft«o C Y P77 5'J-oii:LTtt, MM 

RtfiftSiofti^aott^'MS^^y/^s^tA.rcC y P77$'j - i, 2, 3 

. 4 Ln^400775'J-ono, » 1 0—1 5 oi«oie?4ii 20 

tf, to^i^i2oft^i%Kit 5. cypx-k-?/?^-©!!!^ ^ -a- 

b T > t Mcffll/>5nA¥nT&gfc::£T©mSiJ©3 % © 8 0 %JW±©ftllJfC«H&LT >/•>£> ^ 
iStatlSo ix.lf> CYP 1 At77r$ l J-(Cli, 7th75/'7i> > 75 h U 
°7 f- V > . £ 7 x > , ^aiftfv, An^iJ K — ;l/ , ^ 5 7*7 5 V, * -7 V if tf y , 
> V -b h o V , 7 x -fe 5 1 > , Zfa/^yxS-y. ^n^^/n — * * 'J > , r- * 7 

< o ^ © F«g £ # m © ft m t fc ^ t , # im ft * & fij * m -r „ 

[0014] 

^<ofr©c y paijR^^^ttomiB-efiFftUTfe*), cmup©a ^ © *> f *p * w 30 

, 3ffi ft tf , S 14 © M '> X tf f? it fc £ o T , K X © S 14 ^ ft f -5 & H il £ ? * £ t T ^ 
5Ci:^l«LTI->§ 0 m *. tf , >ffl M © £ » 14 tf , CYP2C 1 9StfCY P 2D6l 
Ik ^ fc ^ T & < 1$ it -3 tf £ n T 3 o CYP2C19©g3tfCtf. ^o^^^av, V s 

THf^A, ^;75;y, ^^x^-h-y^, ^^o^=k, tf/^v-ji* 7i^f^ 
y> 70/7/-^, Rif7n^7i7yA^jn§ t ctieoie? o^igfs<*ti, 

cnsosa^a^siij^TftfflL, cntfM#^©w^&?&^©}£igk::}o^T5iftiiift& 
ffl * s -r c t* ( ff 

[0015]

c y poin^stttt, cne.©-&figti©^T©^^{cjijs-r-5/c:a6fci^{c^< & tf n 

tf & 5 & f t tf V * , fa* © C Y P S {5 tP £ © * n^f fltf , ^ © & f& g|5 IS tc 40 
j;oTi^Stl5J;0UD»^lifStt5ILT^5. £ 8U ft W tf * ^JOCYPI 
^ £ ffc) © « X tf S 14 © ^ It <fc OilP^nSC kA'TtS. C Y p mw<D^m t LI 

tk , cYPo«aic6tf«ie?2ft ( w a tf , a e ? £ « 14 ) „ c y p t m «r as 4 ffl 

©SfcttStoKJ:* C Y P ft 88 © PI S , SO'Si§tXtii©4«!*ilcii,#£©c Y P© 

Hi*^$tl5. C Y P © U5 frj & t>* 8§ * tf , 3a»|qj*<OjBaiJffl5f^fflO«t,— 

* S o « A tf , CYP3Atr775iJ-tt, <&8KS#*£i;.6l|sit#{btfLt:X*5>' 

»j a a* -> -9- 7° u KtMiLft, ess m ft tc a s 4 gg au ti a it m ic m m l t v> -s . as © m r* tf 

, CYP3A4S»*CYP 1 A2IIIJ, ^ < OH»Iffi»JOft»*a 9 T ^ 4 , ^?>{C 
W««*K«t5. cn5(DiI(DltSS:8!it1tei^lit§C tlCiot, E SP it tf 50 
[0016]

C Y PX-/^-7 7 5'J-lc»S?n«SKOii:LTli, a » ft W * , F a * S/ /Mt 

s k *5 tt s & x f -r - s Itc^sois, t tirt Ffrz^mmRzf t fr* >s\(om 

m tc *3 it 3 ^ ^ * * > *\ S 7 -b * - ;1/ 4> R9 W t L T ffl </> fc ifi St f t S lStfSSn5 0 B§ a 
KD^yl/^ + ^Fii, ft ft 7j< m & OF T 7 ;U -r t: F * f# « * fc . S ft p m § * S S « c 

;l/ * * 5/ ffl l± , 5?§8HtEJSO<fc3fc:* ^^^^^iSo^gSSlft^ff^^<^T;I/-r r bF 
[0017] 

c y p tc sa a t fc ik ft s ft w <o m m <d m t l t « > 7^V7^y7^.y. 7 & 7 x y 2 x 

;K 7^1/^777^, 7;^l//D-;K 7 ^ * 7 d ^ % 7U'J^f 'J>, 7 X *r ^ 7 
— ;K 7Xtfp>, ts7^>(y, * t Hf If > , ^p/1/:7xx5^;/, ^ ^ y 'J F > * 
n ^ 7 5 ^ > . ^a^7^y, ^ n if \L y , n ^ V > , n ;l/ t: ^ y , n ;U 7 — ;l/ . > 
^p*X7r5F, ^^PX>f 'J>, ^yyy, 7* 7 5 5 y , f + x hn^ h/l/77 

S>*n7x-*-*. 7 -Si** x^A^-F, x U x a t V , x 

X h 5 * - )V , 7xn^t!y N 7 ;l/ * * -fe ^ > > 7 ;l/ m X ^ ^ > , ad^ij F — A/ , << 20 
7 7 p 7 x y . ^ ^ 7 5 ^ y , y ^ t if ;i/ N y F ^ # i/ y , ^yF7 = >, 

;b # y > U F * >f V > D-t/l^y, v^D7^FR4M, ;>< 7 x x h ^ y , «F>, 
^ h 7 p p - ;U > * > U ^ y , ^ 7 75 A . ^ * p ^ 5 F , ^ 7 p * -fe y , *7 r 7 F 

y , x * ;i/ e y , x -7 x *j tf > , x h u y ^ M y , J b v 7 * u y , t7 yf if 

^^777-;K ty^y-b Fay, * * n F > , ;^>J? + t;K p + -tr ^ y , 7 
x-f-tr^y, 7 x x h ^ > N tfp^ zsljl± s ypyx-rPV. /d/Oi/>, 7p75 
7 p - ;l/ . + x 5; y , U F ^ \£ >l % V * + If ;!/ , -b ;b h 5 U V > *s )V ir + 7 4 > S - 7 

^ ^ U y > ^ ^ * > 7 x V , r- / tl A, T )\s7 if y ^ xXFXt^p 
7^ * 7 << U y , ^-^p-;1/, h ;!/ 7 # ^ F , h 'J 777 A, ^ 5 ^ ^ If > 7 5 X 

f y^tstiSo 30 

[0018] 

2D6^g6(i, S^M, STBa. S WOlttii t HI L t t>S ^ tc W L T ^ 

f t ?L SI ffi > St/IS?itt^XhP7^-*^jn5o 
[0019] 

ω-Hydroxylase cytochrome P450

*»!BKJ:9I«Sh5*S4l: h>y/^H, i tf ^ n 3 - F ft ^ iB ? « , W 

* tf , F ^ D A P 4 5 0 4 A 4 (CYP4A4) . ->h^PAP~450p-2, 7p 40 

x * 7 5 > v y , to-t:Fp^r^5-"tf, sa'^') u yiw - t: fp + y7--i!^tty, 

ti Fd^7-^->F^dAP 4 5 0 7 7 ^J-[C|iaLT^5o co-b F D ^ y 7 - 

-tf y f^uap 4 5 o^y/^Ka, 7px^7 5>^va. so\ *^yy»t, 
'Jv^^. /<;U5^vKffiOJ:3ftfliJW»Ofl) - (<» 1 F n * S/ ;Wt K tt « ft 5 

ffl^R(?T (Yoshimura et al., J Biochem (Tokyo) 1990 Oct;108(4):544-8) o CYP4A4 tt, S WW * ? *

?>n^> (Palmer et al., Arch Biochem Biophys 1993 Feb 1;300(2):670-6) o
[0020]

Matsubara et al., J Biol Chem 1987 Sep 25;262(27):13366-71; Yamamoto et al., (1984) J. Biochem. (Tokyo) 96, 593-603; Yokotani et al., Eur J Biochem 1991 Mar 28;196(3):531-6; and Johnson et al., Biochemistry 1990 Jan 30;29(4):873-9.

[0021]

#«Wte*Dffl#*ns*>><*K<DJ:3ft$'h*nAte, ± & L fc £ fc An X X , £ < <0 
* £ 4> ft "T 3 „ Bf«Btp0^fa^ K, lit, HftOifcffcfcAnitTv s > h ^ n A (i , * 

[0022]

■> h^oAOftS^WSO'P C R^-xo77t^(i, SIMB¥lc«^«<1£ t R TJ m K 
$©WSlcffl^6ni5„ >S ft & SB 0§ « H£ »H± , 1$SOi/t>^nAiffiS^fflL, flflflS 
5E*«5|-rs.fc5fcl8tr-rsc: i: # T* £ , cnfc«fcoT»ftW*a5©*rL.i^}fe«#i£a*JI 

[0023]

C £ & X £ £ o JKtf^tli, a - h 3 7 x a - ;V , & ffe to ft ft W it $J l± , 7'J-7 
^*;VOi^^fl»lt5. 9 iV Z f- * > R If if )V Z * * y 1 ;l * * i/ # - M it , 7 V - =y 
^*/l/-S§f8tt©MflS»«**&©e*S«ttfc:*F4l,T^3o ^TO*>h^ni»*«rar3tt 20 
SCtfi, «fc0«dSWftl8<bl»il:»J©BB«*«ifi-r«ci:i:ftS. #5§B£{Ccfc9}f{32n 

§ @e jy a . w 5t © ft ¥ ^ m m m © m it ic m ^ s c t w x * s „ 

[0024]
[0025]

For further description of the CYP superfamily, see Igarashi et al., Arch Biochem Biophys 1997 Mar 1;339(1):85-91; Med Lett Drugs Ther 2000 Apr 17;42(1076):35-6 (no authors listed); Fowler et al., Biochemistry 2000 Apr 18;39(15):4406-14; Lamb et al., Chem Biol Interact 2000 Mar 15;125(3):165-75; Chiba et al., Xenobiotica 2000 Feb;30(2):117-29; and Meehan et al., Am J Hum Genet 1988 Jan;42(1):26-37.
[0026]

c y px-^-77 = v - it, m m © \t m r xs m f$ <o ± s & * - y >v l k # 

fs © # » fc *3 ^ x m m x & % e 40 

[0027]

UDP-glucuronosyltransferase

ffi I I ft*KHILfcia«S:3lll|i|fiIffffl'sOB»J4I$oT^5. SI »J ft W K M » T 
£ ffl I I g? i?t © M g & ^* ;l> - :/ (i , ^;l/^ny->;Vh7>X7i7-€, f$ K U D P - 9 
;l/^ayS/;l/hv>X7x^--tf ( " U G T " ) X — f\ — 7 r 5. V — T' $> -5 <> UGTX — 

/^-■7t 5 u — co ^ > ^ — « , ;jg -14 <t -a- ^ ^\ cd 3Rtj k -r— t l- t © u d p y >i ? a i/m<v 
& m m im ic m m if m * r & l , c n<b <Dy^^<D^m^.^m±\. , mmmzm^-tzzfu-b 

xxibZo "BAJSfcfevT, ^/^pyStt, ft SB © J?S S % > R tf JH m 6 i* ic A o T 
#1tU^rt>fc*afl§»te{t^to©«ia©^l»fcflH^&ftTi/'>So 9 )\/ * o y -> ;l/ h ^ > x 

7x7-4f£0flW?fiJRO*Pfl*»JC0jp5#{i^a]T*fe0, &tflR]»tt8*dtr«E 50 
&mm(oftmic§zwr s^mte^m ^ x ^ % „

[0028]

5£ jg £ n S 0 IK, Bi, ffiD, & l>* *ffl K tc l> T t± , '>4< fct 1 1 0OI45CYP 
7. — /?-775'J - — fcJ5ftTl/> So fc: Mc *5 V T fi, 3 'J- 

£ S £ ft, 30(077 5 'J — # |3| ^ £ ft fc c 145UGT77$'J-T*li, T 7 ^ / K IB 
M <0 ffl IQ 1± 4 5%*iST'S5tLT$iitl ; t7*7 7 5 y-Offttt^e 0 % © tl |B) 

■tttffcSo ugtx — — 77;y-o^A - t±^e.{c> ifi^a, ai(c*5^ti 

Ui^nSU D P^'U 3->;l-b7>X7x7--lfX-^-7 7; 'J-0-ST'$li 0 10 
[0029]

ffil t# U G T g?|g ©fSfiJti , (cfc^TiS^feOi: LT, Kffitfffi 

SoTi/^s 0 ucTiiti, £<<o«gfciP]»?*ai*£ft£-e, isi ft g & is #j ti s: ^ 
m<D&{t<D±m%;iM@x* & % o w*tf, ^ > y -s tr >, a^-tr^A, * * d- -t? a , 

&tff- vHf/< Ati, Rtfk:i#tti;£ft.&fl»fi:©#, ffil IS 

[0030]

til I ff US (i „ »S5tt«>St©<J:5*feBH£toH*ttML, i I I » 3R * 3 - K 

j» #j * « m m n © m tc s 4- 1 « ^ © . &m&mn<Dm-mmWi$kfr *> numt^mmz ft 20 
RVft?3LVa¥<Dm9i<Dxm*. a « 1± ffl 1 1 gf m if§ » su & t>* ^ <o * - y -y h * p /£ . s 

If # & rS if Z Tc tb ic m ^ 3 C t T «5 5 o JS5 © ft 3* ^ 85 X 8J fc ft * f S it {2 ^ © pq ^ U , 

*neoiB*wft^*-xi»oOT55*ss»cL, aa a ? n g& , # * sue it , s tf « « e » 

[0031]

uGT»*»csijibfc«jw*a{t*-&«a6sijoeijfcUTtt, z = h y 7° >; > , vfuy 

)\s 7 4 y „ ^D;l/7n?-;y 1 ^ d tf > , n f-* f v , > 7" p *\ 7° £ S> > . {'tHPaf 

K^-trtf^. -CST*^?^ 5*hyj?>, D 7 if/U, * ;t< t: * , tn*70 30 

[0032]

m « , ugt iie?©saej;?)fic5iirftft*i»»f raBo, uct 1 a 
g s « , * u y 9 — t ■» - fi i s ic ii ^ l 1 1/ 1 § c ttfiEwsnft. 

[0033]

UCTX-;<-77i';-tt 1 & CO ft ffl & t>* Bfl SB © ± S & * - y -y h T* & 5 . Lfctfi 
X . «6**aoUG TX-/<-7 7 5i;-«D^y^-%BtL«ltr5y5i:i:H;, IS ffj M 

[0034]

^SiJ-f^^PH, «HC<»-t KD4 ; >'5--tfi'Fi'DAP 4 5 OlSI-«l»lf 77 7 
5 U - O ^ > - tt % SISiJ<Of^fflat/BBao±Bft^-yyhT?feSo LftA'oT, CO 

, ^S'JBa5e<D^fflPtcj3l/->T*ffl-pfe5o * fS {J , (o — k K o + -> 7 - t* •> h ^ n i, P 4 
5 0»»J-'reW»j|i-9-y77'5'J-©^>/<-4:(Offl|^1t%Kft>fc, ft * * lift B <D k 

o 

[0035]

56 0^ <D e 
t:



nf.ojttiief f & £ t>* ffe <d nf ?i ^ tc & § c n £ cd 

5, cn'&cD^gft^'T'^FE^K St>*CtX?>0^r^K%3 

, sssij-ttw»iR*s*i"r*ffliifi&tf 
f cd m & m <o m 5g ts w- -s * - y y F 



fi cd is fc & ±i •£ , * 

rg 14 * 11 Si? f 3 «fe -5 * 
— * fc <fc -5 £ , t: F 



ft 
t: 



C 



sa as /sb as, 32 

[0036]

W CD & «B * Itt B£ 



3r ;l/ F a 

£D — p|5 (C 
— K ft t 

nam. t* 

m 1L flf .. 



y 

tc m 

<D 

■O o 



^ fc t, CD T* 
B£ E JIJ it , 

$j - at at 

1 cDH 



» ea 



0 



# #g IS fi , ^aj-ftlltB#fSi:7r5y-©*://**gcD;rf>/s-T&5fc|^^£ft5*://* 
^S^f - Kltt2MSE5>J*l«t5t.07S9, C ti £ It w - t: K p * -> 5 — ■*? 

~>h^DAP4 5 omm-ttm&m-v-fy r s. v - icmmttv ztiz (. & 2 iz * > t> 

HEW, BlK6l/cDNAE?l, 13 3 K y y A E ?U £ * "T ) o BZKjsjni^^f 
K SB ?IJ , IBI «l tc * m m m tc 12 3c 5 tl 3 *t ft ^ S i* , If * W *ffl • R O* El 3 CD 'IS $g % ffl ^ T 

iRi^*ns»3i*af*:o<k3ftwe.*»ftaa(**i, cc-ett*«w<o3Sffj-«»»jis ^ 7 

^ F , SffJ-f^afifjR^^^-K, X{i*f8B^cD^7°^F/^V/^Mi:P?(fn5o 
[0039]

fcS^ttcnfcdtrJIilB^^F'Rtf^v/^iatf^fcfflet-rs (H 1 ft* 






* « Wl « , t F y y A cd SB 5»J ft? ffi ic S -3 ^ T ^ § „ t: F y / A cd E £'J 8? #t & Hf S tc L 

JlffJ-f^W»jR-9-y77'5 V - ic ti6Mtt H b ti & * > * V. / 7 7 K / K ^ -/ V (c » I 
T , «|ji&a t /XtiE?>J©fflRI , tt«:W-r*^7 F ^K*3-K{tLTU^*, t /AOl 
**«»077y^h*ilfl6*>i4oft. C ft 6 <D E ?U £ ffl T . W *P W ft y / A BB W 

tto¥Rtf/X«:cDNABe?iJ£¥l8tL, ^^{tttfco C<Dfi?«TfCffl^V«T, * 
fS > ni-tFD + ->7 - 4f ->h7nAP 4 5 0 igiJ-f?IB»f t77 r 5 'J - IC 

K & * - K{fr*K¥»jlBoa»E5>J» c D N A E 5>J & / X 

«yvAE?'j. ^bs^s , aaoaa^ft, Rtf*fsw©i6»j-'rew 

S#*fC^tLT^|jgXl±E?iJcDffl[B]14^*-r^> S&MiilicDii&t^gSEaicD^V/^H/^T 0 

^ K/K^-ryfcia-r*i««%ffl«-r*t)©i»*So 

[0037]

* ft B8 tc ^ v> t m m. z ti z * -f =f- f it , ce**»j-efesci:fcftp>iT, Kiswfcagftsy 

If IC , *ieB^cD^y^F«, m-t Fa^v"5-Hf ~>F^aAP 4 5 O^^U-^KtBI* 
TffiPl'l4&0 : /X«^)g±cDffll§e^WLT^i)C klcSd^Tl«?nS. 0 1 O^i 30 

f-^ici^t, t hoi, m (?Li?%#cy) , ?^rtus«*, wszflsi, mm. n* 
Ftc fc-ttsianwfigis^eiTi-rs&cDT'&So ^feB^cD^y^- k © «t 

ftttK&a'^cDfEffllcHaLTti, ^BSffl*. !(ffcS«SW, l2lfficDafi?fcIB«$n, 
/Xfi^8"J-'RiliBI^^V^^S<DBl59]«D£o-t Fa + -> -7 — -tf F * u A P 4 5 

[0038]

[ ^ m m cd as ] 



^tl5ii^?T'S?ilE?/c D N A, X fi H 3 IC 7* 2 n £ V V A BB 5>J «fc 0 3 - K (t £
[0040]

« w © 7 f a , mnjui.<&(Dm&ic%: % % c ttfT z z o ihou^ab 

o ^ t (* ^ a? -r s ) o 

[0041]
[0042]

rHK»*=<k*JW««lKXttfl!!Oft*«J«*^*«:^J t v> 5 ffl g§ {± , ^ figlc BB ^ L ft 
*89«g«sKXtiffio{l:^1ftll^5»K*nrc^^^- Fl»i*i>9. & 3 0>J t U . 

r£KWKfb¥flllB«lKXIiffi©<t &<^J fcte, <t 3* M IE M X tiffi © <t 3* 

<Bo H * 3 0% ( & & « 8 ) * SS , {t*m««JHXt4flfi0^t*»K*«l2 0 % * $5 . ft 3* 

[0043]

5 o 13 1 K ^ "T iH 1$ -r - £ & , 1 1 OlKf-^tiSt^ tKDl, JH ( ¥L !E £ ^ ty ) 

, ?ntommm. wscjr, ri, Bias/Bias. 5ittit> jug^ stfiFfflBs 

[0044]

Lfc#oT,:*:«Wti, B2tC*£tt*7' = /»BE?lJ(SEQ ID NO. 2) 

^V^i'I, ixlf, 01(Ci^n5K¥/cD N A^lEJiJ ( S E Q ID NO. 1) 

, Stf0 3(C*Sn5y/AE5IJ (S EQ ID NO. 3) IC <t ») 3 - K ft J n 5 ^ y 

5 V ^ SB ?U # * y H © » *§ ft & 7 = y ^ E ?U T & & i; £ , ^ Rli7 5/iS?J 

[0045]

#fiB£li2£tc.02fc;^2n£:r~y&IB?iJ(SEQ ID NO. 2 ) 5> H H ftfc 

5*>'<fJl. MAIf, H lKi*ft5fi?/c DN AfliKEJIJ (S EQ ID NO. 1 
) « SO"i3lC^Jtl5y/Affi?| (SEQ ID NO. 3)KJ:D3 — KffcSft**:/ 

fi\ a»W»c*v/<i'Sl^Jc|?ii~ioo8fi<oW4nja», ft S W tt l ~ 2 o © ftp @! 

[0046]

0 2(cS?n57;yiEJlJ (S E Q ID NO. 2)%$t?*>/^ 
S, iitfH 1 CSSti5fi?/c DN ASKEW (SEQ ID NO. I) , RI/B3 
KSSnsy/AEJlJ (SEQ ID NO. 3) ti D 3- F{kSft5<i>'^fi*fi«
t«t>OP*5. COTS/SEWfl 1 , *>^^j|<Dg**ft&7 = 7&£@2?iJ 
-g|3T'fe^«^, Ha75/»E5>J%tCf. C<D<fc?&if^s * >/S ^7 H « ^ 7 

f FO*T*8fr, X U * > /< * n t £ M tc |g Q* -3 T 5 7 5 V Wt S , 35 5 ^t475 
/g£S»/^7^F@2?Ufc#fflra<D7 = /8$gegcQJ;?&, ft *DWS 7 5 y ( n - 

f ?t <* n bb ai t m m -r s ) tstsiittfTts. iKoi^^^^n^ 'pm<o<t 

[0047]

# 5£ M © S - ft W 8# iR ^ 7 F li , **5X 14 # ffl 

m tt © G ?"J {c Ifi -T § C £ W T * 3 o CCi^^^fiD'i^^/^Sti, * »J - ft 

ixmmm^y^- f t^mm * > <? is- i±<p?m-& lt ^ z c t * * t . ^ *s in * 

[0048]

l> < o *>> <D ffi ffl t? l± , 14 -n £ y/^Sti, SiaiJ-«W»3R^^^Fii*0}g1tK:»«*R 

ff S * ^ o 14 * > /< * H « , WAtf, /3 ft =y * V y ¥ - -B Hi , h 2 - 20 

'J 7 K C A L fit □ > # U H i sB^, MYC11, HI«l«&tfIglk£i:^ofciPf)S» 

»c*ijhi sS6^(i. mfrmzmm-Kmrnm'*-??- h' <DmmicmmT& z o & 5 at © 
m ± ffl ua ( tai * « * ?l s <d m ± m as ) k. & ^ t « , ^>/<^K<o«^atf/x«»is&tt, 
if ti ihj is se 5ij * « « c t k a o m *p * « c t w t- # * o 

[0049]

**^X tf *P © *1 « x. D N AgffifC i 9 SI JS f S C fc T # S . 

0IJ * tf * I^5^y/^SE^J^3-KLT^5DN A77^"^>hli> ft 3fe & ffi tC ft o 

*fi«tJ:!)^jSt5i:i:*«nrffi , e*«, S , 'MBit 7 ^ ? * > h (D P C R mmiC 30 

yyA-r/^-f-^-^rffli^ 2o©iiwsig?77^7 > b micmistto 

IU ^©t£> 7 T £ c i: tc «fe D , *^7lfifiJiJ*Si«t4i:i:!!i'T4« ( 

Ausubel et al., Current Protocols in Molecular Biology, 1992 #,1) „ ?5IC, fS^^^-fcLTtt* K (C 14 g|$

15. M - ft » 1? IS ^ 7 ^ F £ 3 - F ft L fc: M ^ 14 ffi 7 U - A 4> T* S ftj - iX M 

mm-^-f^ f k m ^ -r « £ 5 tc u, c^«t -5 ^^g^^^^-cptc ^-->yjn5 c 

[0050]

w ± v tz <t ^ , *¥£wit£rc ^mn^-f^Yf&mmm. ^7^ Kowaie? / 40 

%i«i3 (c ^ n t fl 7 5 / iie?H t i n4 ^ t o t ^ 1 1» s ft 1. , 

[0051]

^ 7 ^ F ic tt f 5 @2 ?U & Q* / X « W ± to m 1± S -5 ^ T , . m <D ^ 7 ^ KtSlKKSJ 
t5c!:tft't5. fflHtt/B-ttOlfili, ± (c , ^<y^F^«lfi6W^Sf*T^5^X 50 
[0052]

Jttt*fT3BW"C, BB ?iH± 7 ^ > £ n 3 ( M A U\ SSg&7^^>^>hOfc£>tC, * 

♦BnttBBWttJttttfTifcfetejfca-rscfc^-efpa) . & m * t l t « , s*ewo 

l?O / >4<it30%, 4 0%, 50%. 60%, 70%, 80%, 9 0 %X«f tl« 
± ifi , tt«a«(CJSi;T77^ V3ti5o fit, WJC-rsrs/REBXtt^^L/*^ 
K ffi B ± O , 75y»»axii? * F^ttK«ft4e »-E9JTOSItf, fg ~ BB 10 

w ^ r w is f 5 ga b £ i; 7 5 / saaxtt? * u * Kfci^tAtoenn^ t 

» : F»4*OBEB±-era — -pfeS (CC1?ffl^5nT^8 7 5/ S[Xft«80 r [WJ — -fi 

j a, T^y^xti^^o rtai^ttj traaaiTfes) . 2o<oia?ijwiora-it^--b> 

[0053]

The comparison of sequences and the determination of percent identity and similarity between two sequences can be accomplished using a mathematical algorithm (Computational Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing: Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer Analysis of Sequence Data, Part 1, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York, 1991). In a preferred embodiment, the percent identity between two amino acid sequences is determined using the Needleman and Wunsch algorithm (J. Mol. Biol. (48):444-453 (1970)), which has been incorporated into the GAP program in the GCG software package (available at http://www.gcg.com), using either a Blossom62 matrix or a PAM250 matrix, a gap weight of 16, 14, 12, 10, 8, 6, or 4, and a length weight of 1, 2, 3, 4, 5, or 6. In another preferred embodiment, the percent identity between two nucleotide sequences is determined using the GAP program in the GCG software package (available at http://www.gcg.com) (Devereux, J., et al., Nucleic Acids Res. 12(1):387 (1984)), using an NWSgapdna.CMP matrix, a gap weight of 40, 50, 60, 70, or 80, and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the percent identity between two amino acid or nucleotide sequences is determined using the algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1989)), which has been incorporated into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length penalty of 12, and a gap penalty of 4.
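The percent-identity determination referred to in this paragraph (Needleman-Wunsch global alignment as used by the GAP program) can be illustrated with a minimal sketch. This is not the GCG implementation: it assumes a simple match/mismatch score in place of the Blossom62 or PAM250 matrix, a single linear gap penalty in place of separate gap and length weights, and it assumes that percent identity is taken as identical aligned positions divided by the alignment length. The function names and default scores are hypothetical.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment with a linear gap penalty (illustrative defaults).

    Returns the two aligned strings, with '-' marking gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(aligned_a, aligned_b):
    """Identical (non-gap) positions as a percentage of alignment length."""
    identical = sum(1 for x, y in zip(aligned_a, aligned_b)
                    if x == y and x != "-")
    return 100.0 * identical / len(aligned_a)
```

For example, `percent_identity(*needleman_wunsch("ACGT", "AGT"))` aligns the shorter sequence with one gap and reports the share of matching columns. A production comparison would use the scoring matrix and gap parameters named in the text.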

[0054]

The nucleic acid and protein sequences of the present invention can further be used as a "query sequence" to perform a search against sequence databases, for example to identify other family members or related sequences. Such searches can be performed using the NBLAST and XBLAST programs (version 2.0) of Altschul et al. (J. Mol. Biol. 215:403-10 (1990)). BLAST nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12, to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention. BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength = 3, to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. (Nucleic Acids Res. 25(17):3389-3402 (1997)). When utilizing BLAST and Gapped BLAST programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
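The `wordlength` values quoted for the NBLAST and XBLAST searches (12 and 3) set the size of the short words used to seed candidate matches. The sketch below shows only that exact-word seeding idea, under the assumption that a seed is an identical substring shared by query and subject. It is not the BLAST implementation, which also scores neighbouring words and extends seeds into alignments. The function name `word_hits` is hypothetical.

```python
def word_hits(query, subject, wordlength):
    """Return (query_pos, subject_pos) pairs where an identical word of
    length `wordlength` occurs in both sequences (illustrative only)."""
    # Index every word of the subject by its start position.
    index = {}
    for j in range(len(subject) - wordlength + 1):
        index.setdefault(subject[j:j + wordlength], []).append(j)

    # Look up each query word in the subject index.
    hits = []
    for i in range(len(query) - wordlength + 1):
        for j in index.get(query[i:i + wordlength], []):
            hits.append((i, j))
    return hits
```

A longer word length yields fewer, more specific seeds, which is consistent with the text using a larger value for nucleotide searches than for protein searches.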

[0055]

* ?§ BE <D ^ :/ F © -5 <D — -DZ-gts 5> v * M fc *s v T , /n-(r->yy%giUtu<Oi 

5 m m h ibi at tc , mm-fa 4? xmmit. *§$w <o mm - ixmm m ^ ? f <d ? % co 1 ^> ic 

WLT^±^ffi?iJ|B]-tt^#L> # B£ © 3j§ gij - ft 9J g# fit ^ ^ KfcPOUite^ffilHfcJ: 

Hfblti^aefli, (03ic^?n5<t: t: h fe f* 1 ± 

K-e-y/Snsy/A^^llCfigLTSO, CtlliS T S&tfB A C^y-fy*-*®* 
[0056]

aiSiJ-^WWSR^^^KOjWAjie^ffiatttt, S »J - ft SB S£ iR ^ 7 ^ F <D '> & < tt- 
gMcWLTilfi© ( « Ll^ ) IB FUtSIB] 14/Ib) — e«ST5t h?y;^gT'55<:tt| 20 

« »c , * % on © & - ix m & m ^ -f * k i: p) u a e a m ic ts ^ x 3 - f it s n t ^ s & 

ic 7f, S n £ <fc 9 fc) t F^fefrl ilC7 7 7*Jti5y/At»J:lcfiiLTfe^ cn tt 

5 T S & If B A C^v:/?*— *©£5fc£36IK©SE»K:J:-3T£«f*nTV»5o C C T» ffl 

v> 5 n s t§ , 7 5 y & be ?u ic *j »,■> t > & m m (c ti '> & < ^ t> *tr 7 0 ~ 8 0%, 8 0 — 9 

0%, 2^C«S«Kti'>&< tti^ 9 0 - 9 5 %, X« ; enW±©ffi|H]14% ; ftf-r?)ti^ 

6 x h y ^v> - x > h ^KtOTT*, aBSy-ttatB**"*:/?- F * 3 - K{tf*8»»f fcn 

-c y y ^ f x f s & is be 5>j e «t 0 3 - f ft s n « . 

[0057]

0 3 (c » *«W©S6S'J-'rewafJlS^y/^Sl*3 - F it L T 5 3B G ¥ ic « T It tB £ 
tlfcS N Pf«*St. ffA/:K£^g{* ( "i nd e 1 s" ) ^ttf S N P(i> 4 5© 

g&37?u*^FffiB-r-6tig£nrc 0 c n © s n p iz £ o e c % 7 s / mm&i © ^ ft 

t5Ci:tfT*f 5. x * y v , OFny, &O r ORF©?1-0W© ; eti ; en©SNP©{i£B 
[0058]

X, &3gJS©^L^BE?iJfflPtt/|B]-14%;ffL, t hOl£?lcioT3 - K<t?n, 

tt £ $u # > btirzffi&xit f * -r >*mvx 
z icmmwic urn 7 o%ix±<Dfflm&&%i-Tz> 

vyx&Ztmz-Zft&o ccoi^^/^a^ti 
, tfl/- h ') h&*ft©TT' 

[0059]



ft Si W (C (i '> 35: < m 6 0 % JM ± , 

-a-, 2o©^>'-'^gtiftsyB^^:/^^ 
^sij-^ai^is^^^ f * 3 - f ft -r 
mm-Kmmm^-?^ v <d* >\s v u v\z. mm- ixmmm^y =f- k <d '> & < t & - gp k *t

£ e> tc » m tc a s s m e> # sit £ n . t h <d % & * - y y h & t/ m ss ^ vj © m & <o tz #> tc 
h*>p>x h >->*x> h4*ftoTT. k % ^ - k <b -r s & m ft ? 

TV ^"-rX-TS <t 9 * « » E ?>J K: <fc 0 3 - Kfb £ n , c n fi * 
[0060]

t^ct^T-f 5. c<o<fc9«fl{*f;:{i, lig #J - aSJ 8? » ^ 7" ^ K © 7 5 / & 12 ?iJ * fc 

if, i«!©iiht, fii^wrs/Bftinaa* *»f&ns. cot^ii, g m - k m m m 

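fcOT*$5„ Typically seen as conservative substitutions are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile; interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu; substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg; and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310 (1990).
[0061]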
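The conservative substitution groups named in this paragraph (Ala/Val/Leu/Ile, Ser/Thr, Asp/Glu, Asn/Gln, Lys/Arg, Phe/Tyr) can be expressed as a small lookup. The sketch below assumes three-letter residue codes and treats a change as conservative only when both residues fall in the same listed group. The names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical, and the grouping covers only the residues listed in the text.

```python
# Groups as listed in the text; residues outside these groups are not classified.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic
    {"Ser", "Thr"},                # hydroxyl
    {"Asp", "Glu"},                # acidic
    {"Asn", "Gln"},                # amide
    {"Lys", "Arg"},                # basic
    {"Phe", "Tyr"},                # aromatic
]


def is_conservative(original, replacement):
    """True when replacing `original` with `replacement` stays within one group."""
    if original == replacement:
        return False  # identical residue, so not a substitution at all
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

For instance, `is_conservative("Leu", "Ile")` holds because both residues are aliphatic, while a change from Asp to Lys crosses from the acidic to the basic group.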

^mmm- ixmmm^y^ y it . (ciilt^§*\ xii, p a h *s -g, , as 
y >mitm. m # e £ a- &e © «t ? 4 m 14 © - o « ± k & v t « *g # x & l t ^« c t 

3 i/Mi ffl T* © a S {* © » a< # $ ti 3 . HI 2 ic , 2 >• * H m #t © S & * £ ti X is 
©i?%l^ti, Hltc*tLt$§Sgy7XXli-7^tx ©IB f^RiftL i: # & 5 . 

[0062]

ttmm&^mteicit, at s w a , -ottio^s^wsTs/ioift, * * , « a , 30 

[0063]

(Cunningham et al., Science 244:1081-1085 (1989)) «©SK»»K:fe«/^TBE»lO^ffifCj:0. ft K 0 2 IC

, iao77-V»«8a*IXtJ, LOSSt^clSft^?!^ © ft , £ 8J - R 
»#JlS«tt©<fc5*£1%ligtt, X (4 I n v i t r o«St?gtt#flT©J:?sS:7 , -yte-r©fc 

46 tc SS K « n ^> o $S a S#/1SIS o (c i: T I! §5 & ol$ fit (£ , IS H e 0 , & Si « , )t ^ 

ifitt7^;l/f OliSffiKioTaS^nS (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al., Science 255:306-312 (1992)).

[0064]

[0065]

7 7^^>ni, cctffl^snsi^. ^gij-ttitgifg^:/?- FKB5Sts7$yii 

1«> tt8> 10, 12, 14, 16X«?-nJ-X±^A,-c'M / ^ o ilc0«t947^ 50 
tf, 8X«f ntt±©7 5 7i©^7 , f KT'^S, C O J: o 4 75 ^ ^ y Hi, H 3y tfHC
ti , fi»J 7L ff }g 14 g|5 {& , Hi F^-f Xfi»flliB£K*-f>'©«}:?fc, Ifi »J - ft W # JR 

K^^yXlitf-7tW77i'"^yh, SWtt^T'f F77^^> h, *t Hi ft « ifi £ W 

as & t± , sit ict-jTSiicAfw^si^o^y^a-^ ^n^7A (caj^tiPRo 

S I T E # #t ) t£ «fc 0 , §IK:igt5Ck^T*?5. C(0<t;^j:^tff^a^ I O^02 10 
(c 75 T o 
[0066]

# u y ^ k ti , -estc2 o^^r-y^^Bf^tiTv^s 2 oao7 = yswnoT5y 
* $ n w « o mm-ixmmm^??- k fc « ^ t s tc £ t, * - & m & ^ n &c o t> t « , 

iiw^t + xf, pae^w^ta^Ro'^^tciejzE^tiTfct), c n ti us » # tc ;& 1^ t a 
sit? * (cne>co#i4©i/-><'o^t±0 2n:fct>Teaig^ns) 0 

[0067]

Bt ft] O m m t L T It , 7 -b f - ;b ft , T is)l<t, A D P 'J * ->;Ht, 7 5 Kit, 77 If V© 20 

ft ijp , flg St X fi fl§ S S§ » © ft # IS An > *X77f -7/K/->l<-;l'Oft|g^(«t 
ia , S§ ffi *S , « {b , -7x;l/7^ K tt ^ , fltt ^ ^ ;Wb , ftW£S-£3g*8ojgft, -> 7. ^ 
^(D&tisL, \£ a y ;l/ * 5 >mm<DBi&, * ;l/ = ;!/ ft , y - * 4/ 4? * -> ^ ft , ^'J 3->;Ht 
, G P I 7>*-««, k Hn + j/Aft, 3 7 1R ft , ^ ^ ;U it > ;UXh^;HL ift, 

, 7 /b^^/Wt '<*fl^©75./»©lK8'RNAMttfii ftp , it+f Xt^t 

[0068]

C <D cfc 5 * »f tt ^ IS # tt MI ftl T* to K> , f4 ^ £ ft (c *5 V> T ft ft fC P *ffl fC IS SB Stl T £ fc 30 
„ ^i]3j/*ft, IB H # ia , BSf & ft , ^^?5yiSSOy - A/V^^-y/Ht, t Ko + 
-> /l/ lb > A D P V X > Jlit * if » f$ (C - m W & |£ gfp »± , Proteins - Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York (1993) CD i ? , % E <D f 4^ 7. h *5 V> T 5 *l T § 0 CO^
(CltSili^iii: LTii, Wold, F., Posttranslational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al. (Meth. Enzymol. 182:626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)) CD £ 5 ft < <D XM. * ft\ F8

[0069]

l tc tti x , ^-nmcomm-ixmmm^.-/^ v \,t, ii?ti/c75/iiSiA , ief 3- 

K K J: 0 T 3 - K ft 2F ti T if ■» & , l»i«<SShTW5, j«»lKlSiJ-'rtW»*^y^K 

tf , i/yf 'J3-*) tatLT^s*, x t± w fin 5 y m 1* , ffll *. ft JM is &J 

ft^^^ij-^»l»^-^7 > 7 t Ktfip l/T^5 t v^o ft , K^^XtiSMftt^^rtS^-r^t 

© T & £ o 50 
[0070]

II ft & 7 -y -tr -T (c 43 T , WAtf, ffi {* iS& & 3 , Xt4flfe©ftaE£JtS*»iaS^«fc*6» 
■b-f (cflj^ S H3S (7^/KtWI8*#tf) tit, *Sv*fi» SlStSi'y^i'SifeaS? 

<fc o „ is m ? u - v » xiisisst lto*7 h7"*-T7 h'NOffl^^pnta's. 

[0071]

± fc 3flJ K L ffi ffl £ 2? f£ It ? # ft t± , ^MglCtiXfflftitDCtX&Zo CCD & o ft ft 
m*ffl7KLT^Z&m3tWL£ LX (i, "Molecular Cloning: A Laboratory Manual" 2d ed., Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., (1989); "Methods in Enzymology: Guide to Molecular Cloning Techniques", Academic Press, Berger, S. L. and A. R. Kimmel eds., (1987) A'fcS,
[0072]

mzw <d ^ ? v <D m & & m m a . ±ic $ > * ioeiics-s^t *s <o , pi sue * > 
as . wss> m\wmm, m^ym^. &mnm%t. m^>, Rtfm-mmmmx^mt^ct^ 

s &m y - tf > 7 □ <y r- # #t *C £ DS^ntl^o P C R-^-XOfflil&X^ "J — —Z/?/\ 

-^y/'i-ostts^atst.ok Ltissnt^a ( w st & ffi # ss) „ Wft&ffi&o* 
m B tc e « « n r s «i ig 'it #g & t>- m m m w . w fc ® 1 © « gt « ib ^ *§ * ^ # § c t tc 

i: , tKDB, flg ( ?L 11 % # tr ) , ^ ^ ft ®i « , ffi A SS , V ■ , MRS*, H g|5 / SH g|5 

[ 0 0 7 3 ] 

(»-t Fd*->7 - -t? -> r- ^ n A P 4 5 0 ir77r 5 U - fc M 31 t* £ tx 3 3g - ft Si 

rciiipiiifcfg^^siifflflS&tffflfsucite^T, # a be © — otfst5in-f^i8it77 
r = v-icnmm%mm-ixmmmMmimVicD&wRzffemicGm&s ^m^mm-^m 

itfls, si, sias/sap, mas, st/w *a bs a * t- a st -t § c t 

, liiy-f >7"D7 h # W fc <£ t)SinT^5. P C R^-XO,M^X^ u — - > y 
[0074]

^tttfflT'iS (Hodgson, Bio/technology, 1992, Sept 10(9); 973-80) . SO M % T* ti , ^ M , -r*t>-63SiaiJ-«M»jR*

lO^Kx-ffciSfc* th<DH, m ( *L « * d tr ) » ? ^ m « « > ffl it , ff §£ , 
BJWH** HSSB/»aB» fl9SP> & tfffFfflKM*T?8iSf*ci:a<jnSftT^ 

[ 0 0 7 5 ] 

3 15 V * ffl £ -T 3 ft *b tc , «fl6e3S9l-f^W»*fi:WLTX^ V - ft o C i: T- 

* s . £ £ fc c n & o {b ^ $i ii . ■&«9xtt8*»*«i«j3&K:i3tts, ee/3&s*!pjj£T* 

ft 46 tc M M "t 5 c £ # T* # 3 . ft fSl > S6jSHJ-'HiWWjR*S*U^efi*1?fc}S1S<b ( 
T 3 -7, h ) X ti ^ ?S fb ( 7 > * :f ^ X h ) f 5 £ 3 fij S £ *i * o 
[ 0 0 7 6 ] 

tf , flii © |g - -ft W g$ *§ ) £ x * u — - vytscicffli^c tA't'f coi^&z 

7 1 t* ti , - as w (c a % ig #j - rt m m m 2 > ? n x it 7 ? v * > v # , 

> k ft . c a m p ix m m Ie > 7 t 2 ' - a> m is * => - m s -it f b * a <o m e m ic m m l tc n m 
© £ ? & . mm - ftmwM * > z n. t $ - y v i- £<Dm<DtuE.ttm<»£.it¥W}&m*tfi 
tb -r « c toT'Ss^fff , m #j - ft §u bi is * > ? n t m m it & m t & m £ n 3 x s 

* $ £ n 3 o 

[ 0 0 7 7 ] 

MSft^fti: LTli, W * Wf , 1 ) A ft SB I gOB^^T'f K. SO*7 y^A^^f K 
'J ^ttfSJigt^T'f K (^J^.!f , Lam et al., Nature 354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991) #«) , S(/D-Sr//X(U-i75/i^f,T'f TV^I

& ^ t> it ¥ m m ft 5 -f r 5 'J * €j ^ r ^ k , 2 ) 4-> x*^7"f k ( m X- & , 5 > ^ 

i», XaW^WK^aifeihX^^yf H7-<y7'J. Miltf, Songyang et al., Cell 72:767-778 (1993) #BID, 3) StttCGfl

U^D-vtai*, * y ^7 d - yjixf*, bh^btti*. M-T ^ -c * * -f ^ttf*, 

Fab, F(ab')2, Fab ftm? 4 7 5 V 7 ? <? * > h mtAtri#* St>"Ct<*

ox e h - -?m&7 =7 y * > h ) , 4) 'h?^siRtfii»^ ceijAff, m*^t>-&s 

[ 0 0 7 8 ] 

* S « M <b ^ fU , SK»i$i:»#t5S8*OBlStt77 7'^y ffio^lifb 

sait**-r**», x. it 7 ? ? * > y & mn t & l mm i< & ^ & $ & . iat«i&t5 

[ 0 0 7 9 ] 
x»4£{k¥*i/£to¥w*»*$#» c c ic ?\ m ■£ n % xmic & c n z co x. y b* x * y h

T y -b -f co * - f -y btfffi9&£t\TioK). £ , flfi co « fig fc O l,> T f4 . 3 38 # K *5 V* T 
^*nT£>£>fr, Xf4iaffi, «pfc:02©18*fi%ffl^T, gKctl^t 5 <: t T- 1 5 . #fc 

, ai ffj - w » jr * » ^ * iiw sa x ti i§« o ^ w ^ w « (iE tc o v-» r © 7 y * -r * ?t 9 c t 
, h (?Lii*^tr) » ^-gftmrnm, mum. was, mtna, saos/saB, 

[ 0 0 8 0 ] 

& tf / x « s it ft ft & m a , * t^mm-ttmmm* >^irn*m^z> c tic & ox 
oco^Kii*^^>hcofpin^x{ia8saF f 3Xtiifflflan^-7 p cofRjti^, aa'5b;i/#+->* 

i$i|ffl!jSrt K^-f yoi a i&Iil K > :£ f* & 3 ^ f4 t>- 7* U ->* a > X f4 ^ co — g|5 14 . s 

fcilffffltSiOk LTffli^nsckA'ff « c co i§ ^ . ^ S$ co 2g #J - ft »l 81 iiuc <£ 
■3 T^lSnSo LftA^T, g&3-*l<0{I*§{5)gj£#;£:, SttftcOx^K^'TVhT' 
•y -b fc L T *U ffl 5 C fc # T* # 3 „ C nic <t 0 , MSiJ-ftitJBI^coiSjfe^&^Jtcofgi 
»BWrtOtOK6^t7 y -b -f & ft 5 C fc rJ f?£ fc & 3 „ 
[ 0 0 8 1 ] 

* 96 m co % ffj - ft W IP * * V ^ ■/ K f4 S ft » IE »J - ft Si H US fc ffi H ffl f § f t to ( W 

* f4\ «$»¥stf/xtt';!!fyK) *«a-r5feii>»8+*ns^ffit?a&*«*is^7'y 

■b -Y fc fc * ffl T? £ 5 o c co tb fc , ft £ to # U ^ 7 ^ K fc ^ X f4 *i 5 ft ffl T- f? S £ # 
T T- > ft£to£»»J-ttWIPtt5#y K BJ r§ M S"J - ft Hi 'J 

7f k *• is g ^ » * tc aq * & *i * o t- x h ft to «« pi ^ tt m - ft a* m m * u ^ 7 * k 
fc*aa^ffl-r*«^-» si #j - ft m * * - y -y bfrZBfS.-znzm&tetDm, x t± s tt « 

to^^^-r^Ji^tc^fflT-fe^o ctDtcib, h tKz, mm- vimm mm® tm% 

? V ^7 K 14 , JtltftSSHtWlt-Lft^yf K BE W * ^ ts «fc -5 fc S ft £ 

[ 0 0 8 2 ] 

ft fll fl& % m M <D X * 'J - x y ? T -y -b -f % R -5 i& fc it , £ y/^Mco — 7? X f4 M 7? co # 

a^fk»»^6oa^ftjeisso»Pf*st&^ftL, 7 >y -b -y «r g id ft r * fc k > ag $j - ft 

[ 0 0 8 3 ] 

^ XU X ^ U - x y >? T -y -b f fc ^ T « > v h 'J -y ^7 X fc # > * s * q ^ {b f S S ffi * 
f fflt^C ttfT'f 5„ £ § 0« T- (4 , j> y ? a fC V h U -y ^7 X tC Kg £ f 5 C i: <D T* t 

k^y y*w*DLrcBi^^y/^i*f 5c fc^T*t5o fly a. t* » ?y^ftvs-i-7 

yx7i7-^S^^ y/^Ki, *y-b77n-Xkf-X (Sigma Chemical, St. Louis, MO) Xf4 if Jl> 2 =f- * > ii? -T ^ a £ -T ^ - ^
U-h±fc©*?n, fflBSi$g?to (Millf, 35S-labeled) IS M f t ^ to

essn, ^^f*0fig^fif c w a wr » iRtfpHoti*«*ff) ©TT-g^ftiWy* 

^^-h?tl5„ ^>^i^-htO?g, #!|g^7^;b©|&£<DfcA&fCtf-%t4St#£n, 
VhU-y^X^rH^ftLT, ttWtt?^;!'*!^, X««^4*%»*Lfe»0±a**iM 
^f^clfcfCcfcO^ffl^n^o S^l^f4, l^ftliS D S - P A G E(C);!3 V h 'J i-X 
A^^gSf^cfc^Tt. Stpco«m?*»)SW5^ffli/>-SCfcfCc}:oT. VJlfrbV-X? 

, #y<7'? HXttfO^-y -y h ^ ? co fc "fc, £3 ^ ^ , ^IS#(CMl^lcoeffi^?iJfflLT, 
gJSL, 2 is * M £ 2 - ? y h ft ¥ £ <D & % M If ft ^ tn.W It . ^ 1/- i /Kci

j» ft 2 ft , C<D$>^?mtefa&£<D&£lC£9Z<DV3i/l<<D<£lcMbz.t>t\5 0 %zt?l- 

i**fttti-r*j*ftfcLTt4, GSTB5£«£t*K:«fcSflljZE©3?ffiti:j!«*."t\ g »J - ft m W 

h^-^msfsiioa&sfitf*, x ti > is 9j-ttw»jR*v 

[ 0 0 8 4 ] 

#fgiw©S6sij-ttwiJs*<D3 i -oz&m? zmmi*. ±&<Dft#rJim%m%i&z>^te \o 

ffl»§b*Tffl^5ctKJ:!)HSt5ci* ,, Pi5. — IK » fc tt , flk$]&£$tclig]&& 

1 (c ffl ^ 5 C ttftf 5, 
[ 0 0 8 5 ] 

mf-^lciSi:, tl-Ol, SS ( ?l !E £ # ) s M A > S'JWH 

m. mm/mm. mm. Rzsmmmmmv^mt % c tw&z nx ^ & „ c 20 

n 6> © i& * 7? tc ti , g m m 1$. m f <o m m - « m m m s tt <o ->* a u-**s#©j&«fii 

|b] £ ft & o 
[ 0 0 8 6 ] 

*5Ewofl&©«jSTtt, aiffj-'reiwHSjitkis-&x»±*asf^ffl-r*, x a ^ #j - k m m m is 

•14 lc M a L, T !/-> S ffi <0 £ > M % 111 f § 4t> fc N 2-a^7'J -y K T 7 >y -tr -< X 3 - 
n^7'j7 FT'yb'f (U.S. Patent No. 5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-1696; and Brent WO 94/10300 #i) JCfe^T, iglJ-ftlK
m 2 > 2 n & l"b a i t 2 Z/ ^2 %i t l,tf ffltSC t^T't?5. C <D <t o ft m m - 

ix%tmmm-&* zs^tms.. mz.i£, Tm.m? £ <Dmm - ixmmm2 > ^ 2 n. 

HSffJ-f^WSSR^-y-y h- ic «fc 5 {8 ^ £ 31 tc H 5- L T v> § „ $> § ^ ti , C © <fc ? ft M M - 

Rmmmm-s 2 > 2 s n , ai»j-f^*»H5fia*»j"p**c. t: # & 5 . 

[ 0 0 8 7 ] 

2 - >\ -< r u -y k ■> x f ii ti , »*Rrffi*DNA*s^K^'ry&t;jSttfkH^f>^5/a 

3, *gp#<D&¥SafiflB?©^^a7 — ttfcSS-3v>Tv , >3,, fffi IMC S 5 £: . CCDZ-y-fe-f 
ftt2O0Sft5DN Ail«:fijfflt4, — <D ^ ifi fc T (i ^ $ M - iX » & iR 2 Z/ 40 
^S?r3- KLT^SIg^li, K»l©l5¥iIgPH^ ( 0IJ * ti G A L - 4 ) OD N Agp 
K^-Y - K<tLT^5ilfif iCi^?n5. ffi # © fll ifi fcfc T tt , D N A 8^7 

'f 7 7 U ^ t# e> n , *Sli2£0^>/^i'S ( r p r eyj Xfi r s amp 1 ej ) £3- 
KltLT^S D N AS?!**, Kt»l<De¥liiPH?©S14ft K f > * n - K ft L T ^ 5 1 

fEflcB^jnS. Tb a i t ? y/^gj Rtf T p r e y^y/^gj ^i^rttis 
Ie ^ H AS W ffi fc «i "J ftg tSc y sK - ^ - a e ? ( W A if L a c Z ) OK?4ff7Ck«i7* 
[0088]

# BJ! {J £ £ tc , bu co x ^ g-^y^T-y^^lc^oTH^^nSM^&^ftHcia-rS 

-rsckt>*awo«!HrtTfes. w * ar , *f8»!T?rci5£*ftfc3!S»j ( #j * if ie sj - ix m 

m - ft S8 S# % *S £ W * ) « , Cft60*9JOffl»lc*»5WJatt, « tt , W fl? ffl * W JE "T 

Ifi 8"J ti > c© < fcd*Hffl©ft 5 ffl©**->CZ»*Bi^-r ; 5fci&fc, iftXliftiJCitr/K'S 
bnscttft'tS.' £ 6 fc , * 5g BJ§ ti , c c (c SB « £ ft 5 i£ S CD /c 46 X * u - =. y f 10 

[ 0 0 8 9 ] 
i^^r^o) , f^rtiid^ wans, vi, win mm. mw>/mu. mw,

Ifffflto^T«fflt5Ci*'-!?ir4, C © <fc 9 £ 7 >y -tr -f it , * ^ ffi flR % X « fit # 20 
[ 0 0 9 0 ] 

it ;U co * > /< ^ S % & ttl f Z> 1 oolSJa, ? y^^SiclWWtcg^ts c to 

imjs, #*ft*„ 

[ 0 0 9 1 ] 

* ft m <o ^ r k (± , ^ y ^ k ^ s i* ^ o « * © ^ > s co m is > ^ixa^ns 

.1 (^m^X7 p 7-i'->v^^iIii!)60^^^i;§) , Rtf^iaa4BaR»<DlJW%#tyo 

is, c <d£ o &t -y m & ta & m , X it & & * y t i"f 0 & 5 & & m & J& & -e 

<=> t\ Z> o 

[ 0 0 9 2 ] 

^ •/ ^ F © I n v i t r o^ffiSit LTIJ, »iR»e^ft3SB»«#J7'-ybl' (EL I S 

As) . ^xX^V^a-yh, Jfi f* , X « * y >7 S *S & M <0 J: 9 & & tB K ^ * ffl V> f= $6 40 

yco^tHMS^^^^tcjSA-r^Cfctcit), M^#cOi n v i v o^Ui^ff? H t*' 

oT^atiCii/C^SfiaUttt^-*-*. 61 i* ^ ^ ))/ t 2> £ £ # T* t -5 . ?SSI#lc 

7^^>h*«a-r*^rffitt, w ffl t* $> z o 

[ 0 0 9 3 ] 

ft <0 « IrI £ , ii^Sltftt h©Sf f llCf^oT, ^ M K ^ L T (O ^ *5 if 5 « L V> 
SeM^KCO^TB^WlCliOa^. i^tf, Eichelbaum, M. (Clin. Exp. Pharmacol. Physiol. 23(10-11):983-985 (1996)), and Linder, M. W. (Clin. Chem. 43(2):254-266 (1997)) «#S?nfc^. Cft6>©gg©BgffcWft$g

, »3R*«W«*S*9J©}Si4tt, *»J©ffffl<03SlfitJ«IHUc»*'r*. C © J; ? (c , f@ 
ft ft It £ Xttc©J:iftft:**oa*«4ia4loa««BH6fcT*. it € ? $ £ tt 

§ o LftA l oT> ffl B =? © & 14 *i . &*&H©SI#J-ttW#JR«te©loW±#ffi©* 

n ft ^ „ d©«t^tc, ^7 p ^K(i^«atci5s-r?.iiiE?©^H^6iis-r§/c46©^-y 
ffl us w k * -r > & d* / x a fa © at « £■ la w ic is it % s n *s s ft , Rtf^sj-ttwi** 

fC ft *> 3 © t L T fi . lttO*IfiO'?7*f F^iStSCt^ff 5. 20 
[ 0 0 9 4 ] 

, h c?Lii*«cy) , ^-gftmrnm. m$.i&. mm. mm mm. m&/mm. x 

[ 0 0 9 5 ] 

i/> s *§ ^ x s#*^-y7 h^/f Ktis^L, mns&iit # > n t&< & & Lit*,* & 

•5 ft m # , SH*ttI««t*-y-y h^^f Hkfi^LTV^. iit<*)!)^-y7 F^/f 

K£HSIW{cffl|B]tt©8ftv>ffi©#>/S*Si£^-g-LT^Tt>, ^©^7°^- K Jit f* © £ — 
y -y h £ ft S ^ y 5=- K (c 3^f L t 7 7 y ^ > h X it K ^ f > fc 5 ffl p] ft £ ft L T «,> -5 PI 
0, ffi ti S ft ^ :/ ^ K £ D* o It 3 £ # * e> *x 3 <, C © if , ^ 7° f- K (c ,1£ b T 

[ 0 0 9 6 ] 

c c-effl^ensi^, fotoit m& ft m ? m& %ftx ^ % &<d t.m v ft, c 

>/<*Sn?&S 0 # #8 0J3 © In <* ti , ^'J^o--^;l/j5tf* x *y*a — -*-;l/tn;{* % & t/ F 40 
abXtiF (ab' ) 2, S U* F v © J; 9 fiftO 7 7 ^ ^ > F *^ * n 5 ^> C nic pg 

£> © T* fi ft t> o 
[ 0 0 9 7 ] 

* - y -y b 7" K £ f# 3 fc *6 © , *t{*©£jS&tf/XttE)5£fcOl,"»T, £ < © 75 ?£ # 3b 
€> ft T 5 0 C©£?ft7?iS©l^<OrfMi, Harlow, Antibodies, Cold Spring Harbor Press, (1989) KEUtlT^S.
[ 0 0 9 8 ] 

-ASIC tn. to * £ l£ f %> 7c 46 lc it , ISI^7*f F^ftSSiLTffl^, ^Ulf^yF, 7 
■9-^X{iv-7X©J;7ftBl?LSS^1*{cJg^.$n§o ^fi^V'^ft, fit^14^y^F7^ 

yyyhxfi ^^Htfffli^e.ftSo mciS^7 7y^yFti, i2(c#s?n 50 
X ^ Z> £ o ft « fll tt K * << y * ti /< - f % t> <D T* & 0 , $ , ?y/^S77^y^Vh
[ 0 0 9 9 ] 

2 n a <, at # « , cc»cSB*sn*^^^KoiBrn©iiaw^'5t.»H"r«ci:^'e**. 

L L ft # 6 » #»4fil«fcLTtt, «te/ffitt&tf/Xttg»J-ttW»**S^*f**15 

T* £ . SJiJ77^yy > H4««rSnfeWllftBB5>J77y>c > h*»£t50(cl^5c 
ktfTSio 10 

[0 1 0 0] 

RItt7?y^yHi, - 1ft « fc , / >ft<fc*>8O<0Bil8-r.57'5/&5£a£^A,T^.5 
. LAaStf^^Ml^tt^yf Fli, I 0, 12, 14. 1 6a±«75/i 

bb ?u © «p a e ( 0 2 # i>8 ) ica^i^TiiiR-rscfcjbt-esso 

[0101] 

) CioTgglctSCttfTtS. & at BJ AS 1ft K © 0>J L T (i , Witf, l^Ogl, 

a * # ^ m s m^t^mn. #s ft m s , £ w « # it » k , atfftittfts^tins. » 20 

ii^ilOii: Lt(i, ■b-i'3^7-9-t:^;l/^-^->^--lf, 7 /I/ * 'J * X 7 r * - -tf , p 
- X => 5 Y is # - -tr\ XttTtf ;l/3'jyixf5-^* , ^sn, ft? jg ft ftf ^ # ^ m <o m 
tLTtt, X h U7'l-7(fv ! >/(f^f Z fc" ft , W Jl ft ft 1t 

1ft St co #J i; L T «: , ^v-^U^xa^, 7;l/^-U--tr-ry, 7;^ b-b-i' y • -fyf *->7 

V & J& > P-^^V, *J ? a a b V T i? — frY 5. > 7 >l * U-t > . Jfyj/^Jn^-fF' 
, Xti7-f ni'J h'JytftJn, 58 ft M M <0 fiflj £: L T (i . ;l/ * y-;^$Sti, £ 1ft 
« ft It 1ft % <0 m t L T (± , ;b->7i7-t\ )\> is 7 x 'J y R If X * * U V # £ ft s L 
T , ig « ft $ I* It 1ft M <D #J i: L T & . 1 2 5 I , 1 3 1 1 . 3 5 S. X 3 H ^ £ ft 3 

o 

[0102] 30 

Sttttt, 77^f^DTnr?7^ xtiftffifcfcl$<a<kdft«ip<Dfi*fc:«koT, * 
»fla<0*>''^SOlo%#ltSfti!)liiffl^i:i:* ,, e*S. ta t± . *ffl AS e> <D 

co ph m P£ * & -f ic ffl ^ 5 c t & x t § o ia i co * m t-* - * «& 5 , *iew<oiii»j-f^ 

t hcdw, j» (?Li«^#ty) . ^-gftjmmm, masa, mm. win 
. m / m as , £ is » is $* » m as , Rifmmmmiax'&m-t z c tut. &ms -*r> 

7 a v h^WCiOfx^titV 1 P C R ^ - X O ffi ^ X * 'J — x > y / Wwl/ T* « . MX 40 
RltSCttiStiTi'S. ^8tc, Ct0<fc9fttr[<*{i> SUKoa&tf 
■r§Ci:{c«toT, in situ, in v \ t t o , MM'&ffi-'Wl) . IkTS X <D 

C i: # T* # 5 o 
[0 1 0 3] 
m (?Li«%^cy) s T-gftf&mm. mn^. mm, mm mm. sass/sa^

^H<Dif?iGE<D7>y-ti'-f'£:fT? «> fc ffl v " 5 C i: T' # -5 „ 
[0104] 

fit # a * fc , ^ttrtosiusiafcfctts, fflis©iE#x«siir*Hifl&/hstr©ffiii*»fiB 
s ?grtJ!*«, m&w, wu. mwn«, mm/&m. gjBttisft, iff amiss 

f&mvftmt & £ tffTjkZ tlT\t*& Q IIHTWdili, « £ ? © 7- X h /c T* & < > 
?6^ffi^^x*-?-3Ct&c&jSffl-f3Ci:tfT'££>o Lfc^oT, ii& * a » W fc » 58 

[0 1 0 5] 

J5ic, fit {* a y y a m m ¥ ft fir k & ^ t * m v $> s . l «« o t , £ m * > * jt tc n 
^. fit « si * a»» ^m^u h y^i/>^yfK^(t, ^mftwias^T m*a 



[0 1 0 6] 

fit # a s > ffl « §y © # si &c & *r ffl t- & 5 o 



1 CC> 
WW 



( 



46 fc ffl ^ 3 C ttfT'tS, 
[0107] 



tfi T- % £> o fit *M4 , 0>J *. , tt^^^ay^-rSCfctCfcfJ, FiStt©g» ( r =f 

t-fXxiifyMt^X) ^Sftsctci^^tt^TIS. fit t± > 88 f£ © ft: & 



II & g|5 fit % ^ t? 1# © 7 5 ^*^< > F- tJtLT, Xte*fflflaxt±*fflB$fliii: 



2tc, *»WO^>/^^f|KBB'r*«IJB l itllR*«'r 






[010 

* 58 W tt 

;l/ ^ "p ^ 

[010 

* 18 W (i 
Ktt^ ( 



8 ] 

■So + -y 
tifitt*^ 

9 ] 



, $ p. * 58 US © £g 3SPJ - \\ m M M 7° f- FXfi?y/^g*3 - K ft f S * he t» 



h ictt, =7 ^ ;i/ it $ tx , X U 5 ^ m K fit f* t , ± % ¥ W * > 7" 

tti 7 1/ (?) J: ^tc, £g<<Dxfc? h-7°£D9 ^© 1 o^r^tH-TSJ: r> lc m 
^i> 0 T \s -4 t Itli, ttiTU^i^asn, Sfcfit<*TUi'(Dfci6 

^ n r ^ % o 



[0110]

cc-effl^&n**^, r^gij mm ft? tit. «»©3casfi»fc*$ 

fr6»iihftt«)1?6i. # £ L- < 14 , r^gij & & (i , ^©^BScDE&^k&SMco 
3" 3 E 5>J ) t± * * o L L & A< £ , 0U * , $f)5 K B, 4KB, 3KB, 2KBXS 

< o co 7 y * y y ? v is* 9- f e ?»j * « a* , - © it e =p <o e ?u tc <t o =j - k ft * 

*i ^ 7° ^ K « , yy AE5>I4"(0>I'> h nyic i»)»iai2tit^5. £ S tz jS ti , & B£ # 
-oil, «»E£J<ofc#©ffi©ffiffliSfc&9«$c£tf^£*J;5fc, il < co II g T* & 10 

[01 1 1 ] 

flflj * HT % g^/cDNA^fOi^j r#gtj ^ # ^ (4 , ffl*SI*.8fS*«:J:»J 

KJiSttsw^fctt* <a in us n h . x « is % * *fcft^Wfc-&fig^n-5«i^(c(iMigi* 

X»ifl&©{fc;*1*K*£KWK:$**^. Lfrb£tf&. co«»»?li, flfiOn-KftX 
(4 SI SI E ?Jtc R4 £ 4a § c ttfT^Stf, dft « # « S ft fc © L T # x. f,ti5„ 

[01 12] 

#giLrcDNA»^©fi?ijtbTti, #ffflraw*s±«BJ!afi:«Kf«n/fciia#tft*D 

«ftfeRNA»?i:lTli, #»BO$ILftDN A»f O, in v i v o X 14 i n v 
i t r oRNA|?ft*'tin5. #fgB;Hc4:0#^2ftfc^6§5j s ^kLT«:££tc, 

BEWtcIiilSftfctf^a^iiftSo 
[0113] 

L fc o T , * ft W t± » 0 1 X (4 3 [ SEQ ID NO. 1 ( Is ¥ HE £U ) > RZf SEQ ID NO. 3 ( y / A BE ?'J ) ] K fc T * £ ft 3 3* * b * F E ?U £ i£ 3 & B£ # ^ ,
Xli02 (SEQ ID NO. 2)fcjS£tt3*V/<*K*3 — K{b-f3«»#3 1 *Jfi{K 

-r s t> © -z? & s 0 ? * \s*=f- v & mtt c <r>i&mft =? (dZcSlk. ?■ z 9- ke?j-p*« 

[0114] 30 

* ?g B£ (4 £ £ K , 01X14 3 [SEQ ID NO. 1 ( 35 ^ BE W ) . SD' SEQ ID NO. 3 c y / a be ?u ) ] t ts v r ^ s n *> ? * u * ^ f be ju £ * a w t j$ a m m ft =? ,
xai2 (SEQ ID NO. 2) ic^zftz z y^tnitu- v\t-f z>®.wft=??tm<&

? Z> t © T* to %> o ^ B£ # ? lc !/■> T , m. W tc c CO J; 3 & ? * * 9- F BE F'J C* < "f *> 
© ftf *P ^ BS & £ £ t i> IC ft tE ? %> £ % , S Bfj » ^ fi 5« ^ U ^ F BE W ^ M W iC fig 5 o 
[0115]

*5gB^l4^?>tC,llllXt4 3 [SEQ ID NO. 1 (gfi^J) SEQ ID NO. 3 (y/ABE^J) ] fcfc»^T^£ft35» *U#*-K ETUfc^Cf^EBftf^ XI4 02 (
SEQ ID NO. 2)fC^«n**V/<i'««:3-K{fr*S»» : ?*ll« , r*t»<0-e 

* * . «M« ,; F©««FW**S»EJ<JO / >*< k t> — SBtfc:<E>1£B£BE?iJT*fc-5*l£, BS ^ 40 
ftt?*U*f KEM*^tf. c tile 4 5 i: , ^ BS » ? 14 , *OJt KEWfitf ? 

* § *^ , xttf^*pw*«»as» en * tf , iJssK:ii8-ra«KBB?>j, x t4 ^ « isi w * m m be 

fiJ^^t^C itflTSo CC0J;?^:^B§»^{4, c:< tif ^OftfJpWft? * Utf K* 
[01 16] 

El 1 R tf 3 tc 14 , ra-F&tfll 5 3-F©EJ'iJ©P5#tflI«Sft*. *RiiOtFy/AE 
?U (03) , Stfc DN A/fi?E?J (01) <fe0, 0fcD*£Bg#^(4. ¥ / 1* y V u 

y&m, 5 * k 3 • OM3-HEJIJ, stf^a-FiefiE^j^t/. 

T't/^c — AS tc . 0 1 kBI30PS#»c|5*«tiTi/^*£:©J: ■5*E5>J OWr»»i, 50 
IC *J ^ X 4> *D <D It W V - IV * m ^ T § £ fC It 5£ T § C ttft'tS, W T T* IS f& $ n 5 * d
fC, ^<0*>0»3-f -f >y»fi, ^fcya^-^-cDjc^&jtfi^tOPIfiH^fi, 0<J 

* f£ , lfffiB«4lfif»lO«li, Hs^ffitt^^ltSft^fti^O^-y-y US O 

[01 17] 

* m L fc ft fit # ? « , fig m L fc * > /< * « fc to * T to 6*1 7 ^ / * ASSS Sfe 3 vMi * ;U * -> 
*JS75yi, XttSSi^T'f K(*307;;8 ( 0>J * tf » il!)e!Bi!) < 1 «±o^yf Ki 

* * "T 6 i§ £ ) 5:3 - K<tt5 C t^T$5o C © J: -3 E W *4 , M8#j!i» Lfc« 
fg^co 2 y Mco :/ o tr > y*tc T , *v/<*JHRii©ffiiI, >>/^J!*j(WO 10 
5£ & £ JUS , Xfiy>^y«07-y^^Xf±§!i^cD^cD^ft0^4Mt, X fiftfi <D iff 
»»C*5tt*IS:SiJ*jRfcL»*o - 08 fc v i n s i t u <D«^, ft to 7 = 7 fig ti ffl 8$ BP X 

[01 1 8 ] 

± a? l j: 5 fc , * e t fc & is # =? it , * a T- x aj - « m & m ^ r k % 3 - f it u r </> 

3 BE W , fig $5 L fc ^ y F * 3i - H <b b T ^ £ BE 31 , f LT, ± . X fi » IE 91 ( 0J A fcf , 
p r e-p r o , p r o-p r o t e i n BE ?U ) <D «fc r> & it to W * 3 - F BE 5: # t? , 

c ft fc PI & $ ft % t» <D T- ti & < , ft to #} & 3i - H BE 91 ic to * X , WiiqWft*3 - Kfijij, 
00 * tf , -fyhn>t^3-F5 ' & 3 " E 5U <0 <fc 5 ft » Kf 2n^iooaiRtt?n 

-f (C , li ^ , m R N A 7° a -t -> > y" (X 7°7-Y > yRlf* 'J 7f ^;HUttf ) , 1) 20 
y - A , StfmRNAOSSfiOl8l!*Sft'rt>0* > t t t S 4 < T S ^ 

o to T , ft fi§ # ? fi , #>J * (£, »»*S»fcr*J:3*^:/^K*3--K{tLfce?>Jfl!> 

[01 19] 

mflkLrcmMft^ii. m R n a <d i 9 ^ r n AcDjgffiU S§^ii^n-;yy, ft fig 

SiXtt*Ott*6b*ICJ:oTt)St« c D N A&tfyyAD N A * # t? D N A<D&1& 

5;tt)fH. ft bs , w tc d n a it , - fi fi , x f± * m X- & <o m 3 o m m <o & m « , 3 - h 

$1 ( -tr > x m ) > X « If a - F « (7>f tyxl) 
[0 1 20] 

?5rfii«-r?)t©T-fe?.o ccoj;5&*£Sg#^f±, Wit^^-^ (R — ffiH) x /^a<7* ( 
mftZ&m) £*frha? (S^r^^f*) 0<t-5tc@^{c%^T«*^ <&3t/^iffl^&;L 
D N Affix tift^^-figtCck 9 TtfiEJtiil5 0 c<D<fc 3 ^lfi^fc58^t-5^Si*f±> 

L ft -j T , ±i!iLfc«fc3tc, aa*Ktt7>l/!tf HOIK, ^ ^ , ffli, ffiAA^S 
n^o SSfi. 3 — K , ^ 3 - K ffi ^ <D if -6 6 XttMSTECSCttfTtS, t5 

[0121] 

*#gf!/m££»fC s @ 1 SO'3IC,T?n§ii^f 0^3 - K©7 7^'^ y h^ffl«t§t 40 
£OT'^i> 0 Jfi^lf 3 - K07 7^^ V h t L-tti, ya^-^-BE?iJ. x>^>-9"BE^J 

fc46(DX^U--Vy<DP^%{Ct$l/'>T#fflT*^-So y p - 9 — It , El 3 co ^ y A BE fC 
*5 ft 5 * *> 5. A T G ^ ^ {4 fC *5 1/^ T ^ ^ fC 61 |g ? tl i> <, 
[0 1 2 2] 

7 7 y ^ > Ki , 1 2Xf± ; enJ^±<D|^S-rS7^U^-7 c FBE?iJ5r-g-tyo ?5>tc, y^^" 
^> Ki'>4< ti 3 0 , 40, 50, 1 00, 2 50, Xtt 5 0 0 O? ^ KIT* 

& 9 77{f^y hOfi«tt«ffliWicS^<. fi^J^tf, 77-r^y hii v H 

(Daieh-y P ^Sffilg?5:3-F{k-r^Ci:^-c?#§^, XliDNA7D-7SO'DNA7 50 
7^^-HT*lt'$5. COia477y^VKi, ^Urr^^^^^K^n-^*
^•fi)c-r^rci6cDK»]^? * U * ^ F SB W * ffl V> T * 91 "T 5 t ? £ * „ 5 iHt * ft fc 
"7° P - 7 « , 3-K{ktR^tW^-r^^^ ; &#^it-§/ci6 s cDNA^^^^'J, tf J L. 

-Y t — 14 , ae^O#^^Si§?%^a — > f 5 /i * « P C RSit»(cffl^5<: tA'T'f 5. 
[0123]

ya-y/y^-r^-li, ft §u W fc (4 , *B«(L«8?hft*iJ > FXtt*U 

dtyi ? Is* =f- F ^7 £ €? o * <J =f ? * U * F tt , ftSWtli, 4> & < i: t> 1 2 , 2 0 

, 2 5, 4 0, 5 OXUtnWlOlStW^i'Utf KK, X h 'J V h S^ttTt 

[0124] 

#7l/hnif, **ny, Stf;tt:ft85Sf*tt» ^ IS ft ffi # W ft ^ T ffl ft] <D 7? i£ «: ffl ^ T |W) 

F it ? 3 ? * W * F S3 ?U # * , 0ffi^^^n.5?^U^-^-FBB?iJ, xacoE?J©7 
7 ft M ft IC It , 6 0 - 7 0 %, 7 0 - 8 0 %, 8 0 - 9 0 %, <fc 9 ft 

SUfi^Cfi, '>^<i:t9 0- 9 5 %J^±CDffl|Sl14^: ; fi'-r§€><D-rfe?>o COi^ill^ 
? (i , HBtCTFSftS^^U^^FBB^k X (4 C SB » 7 7 9 * V h K tt L T , * x U 
- h * 0k ft *"> ?> X h V y i? x > hftllfOTf 'W 7"U ^•Y nJ^S i O i L- T , g ^ 

tc m $g -t & c *: -e t ?> <> wassftii, 3- Kit Lti^ie? fiiict d, 

g LTiS 0 , cna S T S RZ? B A C ^ v 7?- 2 <D £ ? &&&ifc<DmMlc & ^ T SiftX 
[0 1 2 5] 

a 3 tc « , * % w © n $j - \x m m m # y *\ >7 n * ^ - f it t t ^ % ji e ? & ^ t h w « 
nftsNPta*^t. if A/^^i^sft ("indels") ^tt?s n pa, 4 5©

Sift5?i'l/tf Kffilt'StiJnft. C ft £ <D S N Plcit)gC«75/SE5lJO^It 
(4 , a^/'i-tiHef 3-K, R t>* IH 2 jjs £ ft 5 # > /Mr » BE ?U £r ffl T § ^ fc 5t IS 
t5c ttfT'f 5. X * V > , -YVhav, RtfORFcD^fiJCD^-ft^ftOSNPcDffiS 
14, ftlftlOSNPiOlHtl^DNAfil, R ^ <D 14 31 <£ D f# 5> ft £ P*1 #i / lh , 30 

[01 26] 

# , s ^ to * < i & 6 0 - 7 0 % £D *s iq 'is % w -r § m & K. >n ^ fV 9 4 X , R He 

fTfc>ftS*fl : *j!ti*LTV>*o C © ^fe T? t4 , A^^'JiC'fXLTaSEWtf, '> & < t 
t«6 0«, '>4< i:t,^7 0%, '>4< tt^I8 0%, Xt4 J tftW±i:35:5 8 cfli^ 
^XF^yi/'iyhft^tttt, S*#tC*>^Tf4^l»]T-$> , 7, Current Protocols in Molecular Biology, John Wiley & Sons, N. Y. (1989), 6.3.1-6.3.6 IC gE *t £ ft T V> S „ 7. r- 'J >
>>' x > KiA-f7'J Jf-fXSftO 1 O O 0!l T* it , 6 X m it i- b U t> A / x y m i- h 'J 9 
A (SSC) tf s $J 45°C -e/W7*y2'-<XL, ^Ot^, 0.2X SSC, 0.1% SDS
<f > 50~65°C T}5fe^-rSo (5X h 'J > ->* 1 V iz-ff)* r h 7" 'J ^-rX^f

<0 fiflj 14 , ^ IS # fC *5 1/^ T S » T? ^ o 

[0127] 
« §| 73 s ? tO {£ g 

*figB^£O^K^^t4, 7*n-7, X7^t-, fb ^ ^ fig (f IB » , Rt/^^^T-y-lr-Ytcfc 
I* 1 1 1 1 T' 3& 5 o S^»?t4, i2iC/TJni^5^Xf K^3 - Fltt^^Sc DN A 
RO'7 s yA7n-y?:#Itl,/iJ6 > &D*02lc^-r-^7°^ FklBI — XaHil/Zi^^f' 
*'y-b>^+ — RNA, K?/c DN A, Rtf y/A D N AOa^7U ^^-{?-
is3>7u-7£l,xmmX3b%> 0 EStC^^n^ctdtc, SNP1J4 5<OSi5:§7^1/ 

[0 1 2 8] 

7 n - :/ U: , Egfc^SftT^SgtSEtf^o^Sfcfctt*, Hft©G5>J£t»*HS-r3<:i:tf 

t*t5. l it tf^> x , * ft a 5 ■ # n - f m w , a - k fa m , & a' 3 ■ ^ a - h m w a> e> 

SI m-f % C £ ft X % 5 o t L % £> , -T T* IC 2g fc «fc -5 tC % 7 9 y * > h (2 * fg B£ 
[0 1 2 9] 

& g? # ^ fi f: fc , ^K^^O|5Jft^OfB^^if'li'r^PCR07 P ^-rv-i:LTt ; SfflT' 10 

SO, &E#ft£&a*G?U©T:x^tyx#^©^/£tc:}3^T&WfflT'&3 0 

[0130]

, ^y^KBB9ijo-a5, x»±±»*«?s-ra»s^^*-*'«d*ft*o ^ * * - « $ ft , 
jfA^f^-t-gf*, cfttiteo^^^-^^tciK^^ft^ w *. tr * fflffiy/i,*r*ief 

Stf/X(iie?4IiO i n s i t uREfcSMLS-SSfci&teffl^&ftS. #J * «:\ 
F*9 £ a - K G 5U T* It s ioW±©1$gWfcWASft;fcg«*#tra-K Xti 
£gBt©ffl|WJW&ffi^^*;&MT^$!£ftt#3o 
[0131] 

[0132]

t£ R # ? « * i n s i t u /\ -f 7 >J -f -tf - a > & tc J: 0 , m R # ? O % & » © 
- KftLT^5llEf (03f£jS£ft-5<fc?£) t h^ftft 1 ilCVy/^n 

lyy^^^iiciitiLtts^ cft« s t sstfB a c^^yf-jctn^iic 

[0133]

0 

[0134]

(5 L T 5 y # +f -Y A © is If fc t, w m x $> z> „ 

[0135]

o 

[0136]

*> * ffl T' £ 3 o 
[0137]

& B? # ^ (2 f: fc , SBE^^Rtf^^^-F© — SBXti^lW^ra^LTC^sae^ie^lRill!! 40 
$*(©§!* & {CfcWffl^&So 
[0138]

Bs/c, &B??ggi©^£, k/k mm, zrs.v><o^<< •? v 

-■>3y7 , D-/tLTttItfe5. lairollg&'r— *fcJ;Si:, # f§ BE © H #J - ft 84 

*® , m / bi gp , s ss w is , w w , & / it » as m «s t* % ii t- s c £ # , (s m s - V y 7 
st«cit.*«nt^«. Lf;*ST, fu — fit^ mm. mm., £&rpx commit ? 

© fp £ © 4$ W , X « ;b © i$ £ t ffl £ c i; # T- # -5 ,, U-</l/3:i£;£2ft&tSB?«DN 

AXIiR N At'feDiS, L 7c # -p T , CCfCfSacSftS^^^KfCfctfS-fSrn — 50 
, 5-* fctifclBflS, IBM. ^^^^oaa&tf/XttiiG'F^lf-KiOWfiBJcflat^Scfc
[0139] 

mRNAOi n v i t r oftfflgf fcli, /-f ^">f t^^>3 V, R If i n 

s i t uA^r'J^V-tr-^aVtftJtlSo DN AO i n v i t r o #t tti & ffi K: l± , 

^fyA^y^Vtf— ft i n s i t u A^ryj ^•Y-e-^a y^tstis 

o 

[0140] 

y a - 7 « . aiffj-ftwBjR*y/<*H*aiB-r*aii!axttis«©!|!ijj£«rff5K»ff-xh io 

*»J-t^W»**3-K{tL/'ca»^ : ?, WAtf, m R N A , X y / A D N A <D U ^ ;1/ 

©n^f-'-^fcfc^i:, * % m © ^ su - re m m m * > * n t± * t koi, as 

ts) , e « m as , Mfi:fii> sua, sagp/sigp, xsr#<i**, J&gp, &d*bt 

x © m m x * y - =• > 9 * * ;i/ t* t± , as -e fs & -r « c t tisnt^s. 

[0141] 

* y - — > y tc & ^ t fir ffl -e ■$> z> „ 20 

[0 1 4 2] 

Lfc^oT> * « li , JR#J-ftMIPiR)l:£?0>«K88L » , *ffl AS & ffl » fc *5 v> T 

?L^^-&ty) , ^rtwjsjs, mi it j®, Hi, m^mm, sHgP/SfigP, J&gP 
, RifiRfflffimm-eftm-rzcttfTiiztiT^Zo c 015 mica, mawtca, m m - ix 
mmm&&<D mzmm-r 5 its vaomti tcov>r t >y -t -r * n ? c t . * 

t§{t^ft%iRist5ci:^tsn5 0 c © 7 -y -t « , iffl sa % ~bl xs m m m % ic v> t n 

[0 1 4 3] 

» mRNA©f§^^rfiJS-r?)^ft(cJ;t)lRl«-r?.Ci:*'iT't^ 0 #Nfi <fc <D f? ft T T- 
NA©%Il/^;l/i:tkfl!2ns, C © J:t $$ K g -3 T , fctUfc^WiiSEIftSSiot^aU 

*c«u<**^«^, fcmit&voitmmmmvi&mm t Lxmfez o & m it & v& n & 

^ M M fg 3t © P§ « SU i: L T |H] £ £ n 3 o 

[0144] 

* 5g hb « $ e, fc , ^ - ft m m m * a ^ -r 5 m ® a a" n ^ *s it 5 m m - r m m m m. m ft 
ftw<omm- Rwmm nit ^ t h^n. us (^Lyg^^ty) , ^^rtfiMfliss, miz: 

M , W8S, pgP/SSgP, 3c®»iS^, BS)gp. Rtf ffffflf&Mm X ftmt 5 £ £ ft 

^/i/T'ti, HtsstKits«nTv^. ^ptcti, ±?3ise (i^tf, ei±{tx« 

7 3-f-yay) , Tftrnm (fflJStJXl±7>^J--tf-->3>) Xti^KfS^ 50 
tfS $ ft 5 o
[0145] 

& s i/Mi , & $j x tit # ? # * y * n * n m l t ^ § in us x « *i m * t m m - r m m m 

m.2 tiZX? y — — >y7 , yH2>f*ffl^TH:£c*ftSii»JXtt'h#^?fc?>f#3o 0 l <D 

[0 1 4 6] 

<d * - * ta k> mz> o m & =? m - > « * tc , {k£»fctt-rsiHBJ!S©£StoRi£ 
* © * *p , x « a # *< as it * ^ s & ^ i? ft m <d ta * ?t ? c t # t- t? 5 . pi m k. . m 

'>T5l T * 3 o 

[0147]

« * » ^ »4 * , si»j-«wafsR«ift«s©siwas{b, ^ k #1 <a »c s s s w ^ ft <n m m t 
«y -t k t w m t* s 5 o « k # 7 a * mRNAo^dasaij-f^WBSiiiae^x aifie 

y d - 7 t L T m ^ 2- c t T # § o ^ 3S % S ti , tK £ , An , X tt « fg ^ * <o 1 U ± 
o?i»l/*f FOlft, ffil«X*4«£©<fc?5S:Sifef*:©ffi#Ja>U H 1ST ;* ;l/ ft * - > 

o^synDNAoni, x«i8*a<o«t5*ae : ?3^-tt<oa{t*^*ft*. « &g 

[0 1 4 8] 

15:3-F{tLT^5lfifiC*5^tiaiSn/cS N Ptlffi^^to A / ^ ^ £ S f* ( 
" i nd e I s" ) ^ttfSNPli, 4 5084551*1/*^ KttlTtlBSnft. Cft 
5.©SNP(CJ:?)gc:§75/ilB?JOf<tti, zL-^--9-;l/ffle^3-K, Rtf02(c 
^^ft§^>-/^KK^J^fflV^T^m(c6SKl-r-SCi:^T"#?>o x ^ V V * -T > h P V , 
RtfO R FOndiJcO^ft^-'ftcOS N POttf ti, ^ft^*ft©S N P<fcDf#£ft3DNA{u 

B, RO : ^ol±«J: , 9ffe.ft§F»1%/^±, x + y hDygi^ffl^TSlK 

ti , (BI3fc^Sft5J:3»c) t h^fef* 1 itv^yjnsyy ifi)c^±t<iILTfc 

0, Ctlli S T S RD'B A C ?7 yf-^Ci ^^^iOlfttJ; o tSJt^nt^i 40 

0 f / L D N A li , EftXttfftP C Riffl^TJBStftftf^Winsc k^fSS. 
RNAXIicDNA^ H9l: LTfflV^ C ttfT* S, $>§^ffl(C*5V^T«> 
O^tBfi. fJ>J * fcf > 7y*-PCR> RACEPCRO^ft, # l> * v — -tfaS^SfS ( 
PCR) (flWJx(i\ U.S. Patent No. 4,683,195, & tf 4,683,202 #B) , XttlOtCiLT, 'J y-->3 (LCR) ( ®\ & , Landegran et al., Science 241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 (1994) # m ) tcfc^T, /D-y/y^^T-otitciiL, Wfcastiae^f
O^SfiEB©[W)^{C*fflT'$>S (Abravaya et al., Nucleic Acids Res. 23:675-682 (1995) #M) . CCJSlCli, ||A>6
w^yy^i/^iRitsiii:, v>7>i<Dffflj&frz>ifcM ( m * , ? J m R N A X

8$ g g « , tiiflDNAiliE^^RNA, X(i 7 >^ t > X © D N A E?iJ t y ij 
X -f 3 C iMCtoTfigtg-fSC t A' T* ? 5 o 
[0149] 



3£ 



ffiij RS B? ^ ?8 it * * - > <D m m tc J: K> , 
[0 l 5 0 3 

$ 5 {C , SB 5>J f$ /£ U # If -f A ( U . S. Patent No. 5, 498, 531) fi 

^ £ tc - m t 3 IB 9\ « , ?^P7--tfP^StBft«-*rl¥fiB, X tt It j& o JS t Mc 

[01 5 1 3 

W iS S T* <D ffi JIJ g ft {± , RNa s eRD'S Xtift^BBajSOidftF^UT'- 
4S!ie?i:OE?iJOill4. 11 ffi D N A SB ?U ft? 4fr J: o T St 3£ f S C t ifi T* fr 5 . ff 



* O i Ki ft $ ft BE ?"J ft? W ¥ m f* > IR77*-1' (Naeve 



W . 



) Biotechniques 19:448)© 
, VXX^*h;MCfc5BB?"J«#T ( A WT » P C T 
ication No. WO 94/161011 

ogr. 36: 127-162 
Appl. Biochem. 
9 3) # 88 ) t> # 3: ft & <, 



1 

1 

7 



t 



Chroma 
n e t a 1 . , 
- 1 5 9 (19 
[0152] 



n a - m m & e> , *-»0****tfj-r*fc«>*c, n 

s et a 1 . , Science 230:1242 
t a 1 . , PNAS 85:4397 (1988), 
Meth. Enzymol. 217:286 — 29 

a!o«»o««a»»»K*ittt-r*»js (o r i t 

2766 (1989) ; Cotton et a 1 . , 
: 125-144 (1993) ; and Hayas 
Anal. Tech. Appl. 9:73-79 
E5*A;/£*'J7*'J;l/7 5 F V * T* <D g H f* X fct 
Ey^taai(l*ffl^T7yt^t5 7Sa (My e r 
1 3:495 ( 1 9 8 5 )) tf**h5. jfiKjRKJ 
. 9 tR W * U =T 2 ^ U * ^ K /n -< 7 y ^ 4f - a > 



Mt 4o ^ T Wffl T* £ 0 , 
Internat ion 
Cohen et a 1 . 
( 1 9 9 6 ), and 
Biotechnol. 



(19 9 5 

c ft 5 ic ti 

a 1 Pub 
A d v 



G r i 
3 8 



f f 
1 4 



RNA/RNA, XttRNA/D 
HSI^SfiSa-rSTjffi (My e r 
(1985), Cotton e 
Saleeba et a 1 . , 
5 ( 1 9 9 2 )), ISft^S 
a et a 1 . . PNAS 86: 
Mutat. Res. 285 
hi et a 1 . , Genet. 
( 1 9 9 2 )), StfJttSJOS 

s et a 1 . , Nature 3 
. jg ft HUBS, S(/IiR«y7^7 



[ 0 1 5 3] 



stco^t©ffitcffli/>5c ttftf 5o 0 3 b , * ?e <d ^ sij - ix m m m # > 



fa ("indels") ^tCSNPtt, 4 5 <D g & 5 7 ? U * 7- K tit B "P fit B £ ft fc .
[01 54] 

*dA/£*I*«AfflK5fttfffl#»*»W><&«3tU:s & ft ^ W R IK *8 # o i/> T <D SO 10 

[0 1 5 5] 

*ae ; f©aiJfflfcwbTfflffiWK:«:««t5«w*ft, * ft * tc s agsij-ttwmts*:"** 

SO^fiiti:ln : ¥^K*' 5 ft§o 7 > =f- -tr > 7, <D R N AXttD N Aif^f (imR N A i: 

^ £ ft 5 o 

[0 1 5 6] 

85 It , aiR^n^mR N A©l^^M'>tftmR N AO 1 W±OtH«i:fi}|«4? * U> * 
^ K S3 ?U * # A/ ft* U -9* -f A tc £ § Kl S fc M M t T ^5 „ <5J #g & ft g b T & . 3 - K ffi 

*f ft> L ft n - K ffi « i: ft 3 . 
[0157] 

OftJ60^*^-*Iit5t,OT?$5. C <Dfcl£>, ffl * & *. ilffl US fC & , ex v i vo 
T*W»3ftS#fCB3£ft*ifflflStfdi:ft, aAOltrttiAJnTfCFfAOftSOft 30 

[0 1 5 8] 

f;*7 htsttSo iaiosi^-r f -^tc«j:i.i:> * <o mm - \xm m m 2 > ? n it 
, t as (¥L!E£#ty) , so s: wss> mm mm. IS ffi / SK ffi , 

SE J8 » « # , )» ffi , &tfffFfflH§«l»T-fggt-r£>C£:A", fom S - > ZT v y b ft ffi lc £ K) 
SStiTi^, P C R^-Xtf)ffi8Exi"J--yi'*/U/K'« 1 ffif%It« t i: 
ft T i/^*, flJAfcf, +7 Hi, 7^;Htsnft, X«7^;KtRTflE**SBf, Xti^^^W 
^V^^Hti-eJKffl-fiWasSRSIt^rttm-rs C i©T?5^M?:§& > ; -9" ^ y ^ 4> CD m 

cco + >y m* , £ e. ^ ay - ttmm m 2 > * ? s m r n a x « d n a <d m m * V h k I, X 

ffi m f % fc 46 co |K * fs ty c k^ftS. 
[0 1 5 9] 
7 U 

(SEQ ID NO. lfia'3) t/fxJn^ffi^flltl^^ftii^f ©71/^Xttv 
[0 1 6 0] 

CCt'lV^nil^, r7l/-fj X « fT/C^PTU-Tj It, Ifi , ■^--Ya^XtifficO^ 50 
O 0>J i: L, T , v-Y^D7l/-f li, US Patent 5,837,832, Chee et al., PCT application WO95/11995 (Chee et al.), Lockhart, D. J. et al. (1996; Nat. Biotech. 14: 1675-1680) and Schena, M. et al. (1996; Proc. Natl. Acad. Sci. 93: 10614-10619) ic fa m £ ft £ is m ic l it # o t in m , m s ft , cn^o^nt^mtLxccucm

•JiASniioi&^^tj;, COi^J/l/^li, Brown et al., US Patent No. 5,807,522 .fCfBSi;£ftS;£i£fc:<fct>$liI£ft3o
[0161] 

T^->D71/^XiiMtli + 7 F f± , » il ti , ^l&O^glft&JiiilO&lSBeMML., 
a«^i-^^7'>'f : •■^:>'XcD^-Uri , 5* * U * K *» , X & c DNA075^^yH)iI%6 
*# s Bte3:*£f*±tC@;£;£ft3 0 *U^?*U*^-Ktt, »»fcttft6~6 0O?*l/* 
^ K ft > <£ 9 «F 31 Ul « 1 5~3 0O?^l/*f Kl, ItWKCfi 2 0 ~ 2 5 7 
f KST*5, fe§^-Y'/cDT-Y^o7'bi'Xti^(±l^ > y HCli, 7 — 2 OO^I/^f 
K fi © * 'J n"? * U * K © # * ffi 5 £ £ ifWmx- & K> m % o 7 n 7 1/^X1^^ + 

•y Hi, 3EP O 5 ' Xli3' E5>J«r$/u^*U=f5t^U*f-F, ifiEMStA/S* 'J d 

? * ^ * ^ k x x a b ju « * (D <8 7g.mm.fr zmiRZ tirc&m%* y =r ? * u*^ f * ^ tf 

t O t' !) i 5 0 ?i'7n7H'X(i^tt*yKc*^Tfflv^n5# l J?7U*fKti 20 
, » £ ? X t4 *t * 4: * § » ? fc ts V> T a W ft * U =f 5* * U * KT$!3i5„ 
[0 1 6 2] 

v *f 9 uT l"( XiitilBi* y F ft *5 V> T , Bl59]<OE5ijO*'Jrr5t^U*^K*HJfi-rSfe 
46 , J*fti:4«ie? ( X « , *»i(cj;!)Sl!BSnftO R F) « , ft §J W fc tt , & ^ 
BE co 5 ' * BH % , X ti 3 ' •FRTtSnviia-ifT/l/d'JXASffl^TltdShS 

. »sw^7;i/i';XAT'(i, a e ? tc t# g w a m s ic m j£ « ft * y v - # isj s $ ft 

, 7* 'J -Y -tr - 2/ a y Ic Jf? jg ft IE H fc G C fig # * ft % , a-Y^U ^^€-->3 VOK 

Sfcft3fc : F»J*ft3-#1flie*8r;feftV'». & 3 * # « * ?-l , 7D7H , Xlitll]*7 
F fC *5 V T , *y:J?*U;*^F©^T&m^5C£##3iST&tM#3o # "J :J ? * U * 
? FO r^7j ti , Jff ft IC It IB ?'J © * * K (4 e L T ^ 3 1 OCJifU*? K*»^Ttt, 30 
|B) — TS5 0 ^7©20g©fJi?7l/tf K ( — JstlZ^ — M) linyhD-yl/tL 

Titans. ^j-yrf^^u^-f-K-^zoiati, 2*510 osora-ess. * y =r v — 

^ X « ffi © $1 , 7^;H, f 77, # ^ X X v f F , X«ffiOl3ftH>g3£»f*:TJ65. 
[0 1 6 3] 

ftfi # , * 'J ^ 5< * U * f- K « , PCT application WO95/251116 (Baldeschweiler et al.) fc 12 Sc 2? ft T V 3 <fc d f£ , Itf *7 7"
'J yy?g, Stf'fV^>'i7 h77')^-->a yil^ffi^TlfiOSIJ;T'^{K?n 
, Cft5©£T«##£LTCCfc#rt>&i;ft£o fffi <0 «1 *5 T* & , K <y F (x«XD7 h 
) 7 n y F ic®.fc r $ ^ j 7H , t(i < I$->Xf A, ftp & , U V , ««WXttfb*W*S 40 
^ I g £r ffl T , c D N A , Xt4*U=r?*U*^F*S«©aaB±fcEB, 3 
C t t f 5 o ±!2<DJ;?&7 7 1"<{;}:, X t± *'J ffl 5J & & ^ B ( X n -y F7a-y F * 

XttH7h7nyHB) , (ig^ftgl^S^^) , SD'^ttt (a,1-;-yF^B^^Cy 

)^rffll/^Tiyig$ft. S fc , 8. 24, 96, 384, 1 536, 6 l44X«CftJ^± 

[0164]

v-f *P7l/^X(i^tU+7 h * m^T V > 7 >l (Dfttfr ? tattle , £ Va V > 7 <b 
||5nfcRN AXBDNAIi, /N-r7y^*f'-tf-i/a>7P--7*4'tC|^ia$n§„ m R N 
AA'llSft, fLTcDNAtfSSSJtl, Z>^-b>XORNA (aRNA) ^rliSi-r 50 
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2>frtb<D7->7l< - h £ LtfflV^nS. a R N A « ^ ft 14 7 2 \s * =f- K <D # ffi ft T tg 4@ 

^^FtA^7'J?^Xjn5. -O + i^-hafefMi, iE 6S fc ffl «j W - a L T 5 
, X(i§fflSJgOfflfflttT'/N^> r, J^'t'-b"->'3>*^c^«J;9^pgfi$n§o a-C7'J 

if 14 © f§ Jg > RtfS^Ot U rf? * b * ^ Kffi?iJ©fflttW&*£i*£-r&fca6Kl3^£ft 

5. i* $ c ea ^. ib , , ii®, & , w ?s , * © fa ) , *g » « , 

tc ^ T , /\ << 7 V $ << -tf - ~> 3 > © if? & , If # a . R Q* ffl , IrI B# tc ft iffl ^ 5 © lc ffl 

14 i/> o r c , * & At a *a n 14 © w tc ffl i> 6 n -5 , 

[0165] 

«St*:EIJ£ , r*fcii><O#tt*JI0t'rsfc©'T?&4. ft flg tc (i * £ <D J; 9 £ 75 £ « , -rXh 
"i nde 1 s") %ttf S N Pli, 4 5©g&3?£U*^K{iBT > ^ig£ft;teo Cin 

&tfOR Fco^fiyto^n^ncDs n p ©ffii«, ^ n ^ ti © s n pj:0i^n§DNAt 
■ „ &tf*©ttHJ:tM#sn*raa&/»ifc, * * v > rz* > b u > mm* m ^ ? ® m & 

[0166]

A©fi77y^y h^ffli/^tSot, 8Stai4fi5 c fc*«i?*S. c <d £ -5 & 7 

7 -t 'C © i (± > Chard, T, An Introduction to Radioimmunoassay and Related Techniques,
Elsevier Science Publishers, Amsterdam, The Netherlands (1986); Bullock, G. R. et al.,
Techniques in Immunocytochemistry, Academic Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983),
Vol. 3 (1985); Tijssen, P., Practice and Theory of Enzyme Immunoassays: Laboratory
Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers,
Amsterdam, The Netherlands (1985) !C K I 5 T ^ l> „
[0167]

oTjaicffl^ensf^h^yT'^Ki, t •> -t -r © jb , & tti 7? & © 14 h , & tf z * -t 

*XH:|fflBSaaHJBJ©H««:x aii#K:*5«/^TBJ55J-efet), fflv>5>n5~>X^Ai:S^fa-r§ 
> 7 ;U * f# & c tc J: <3 , S^fciiffl-T -S C ttff * 3 „ 50 
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[0 1 6 8] 

*5%W<Dl&<om £ It it. * fg W © 7 v -tr ■< * fi o fc tb iz & g & U m % ^ ts * -y h Jf « 

[0169]

¥flc*¥£Wli., (a) LLlcfSIJtiSt hy/A©77^7yhi:&^t5i:tOft 
3 1J^±©&^7j-?£$C?fg-©g§S. (b) l«±©»fcH*K*, *S£««*fcli|-f*C 

[0170]

P ffl tc ti , E»Jtift + 7 h i: LT«> U M S'J ^ © £ §1 ic # $ ti T 3 * <y hA'tSti 10 
£><, c©<£5&:gg§£:LTfci:> /h^v^*7XOgg, y^X^v^ggg, © 7 X 

1/ 7 JV £ tt i% & L T PI %k L ft ^ £ r> E . 1 oog»*' 6fflOg»'\i:KJI*iBi*«K 
tr$«, ift?$Ii$li£ (fl»J-**f , U Tr i s -ffl«j«ts) ^^frggg, &tf*S 

*a <o n§ su - ft it s? it e ? ^ is ffi l , c c (c d ^ £ n t m ?ij it $s % ffl ^ t >t §y w e 68 

K -r 3 c £ & V t s $ 5 *«: c n * a * * tc v T Ml ^ © fit iz * ft ft + >y h 88 > W fc #g gt 

7 Icmfr&tS £ ttfT Z Z o 20 
[0171] 

# £5 HHfi * fc , C CtCie® £ tlSil 73- A, 7c ^ ^7 * — £ ffi {J± T * tOTfeS, r ^ 

^7 * - j £: ^ •? ffl ffi «: . ifti'/i/oc #aifi:f4«*# : ?t*&»?, # ^ © H 

XSB AC, PAC, YAC, O R M A C O J; 7 4 Al»fe#tf^ $ tlS. 
[0172] 

[0173]

# fg w « , &mft : ?<Dun<Drcib<D^>? $ - (^n--y^*^^^-) , x t± a & # ^ © 

<D tc $> <D Z — (mm^ ? $ -) ^ItttStOt'SS. C © ^ ^7 * - , 
% Iffl X & X $i £ ^ ffl BS , XSfWSSTSltSilti!)^?? ( * h i\/ * i? # - ) „ 
[0174] 

Km** z-*T*&m £ mulcts -stLtc c i sfffflewai5ss««r$*» 

cftfc«fcDli±lfflte4>T©«&#?©ite^#pJftg4:&3o c © « IS # ? fi , $e ¥ tc 15 ffi* £ 
RtftS»»f i:»ISftT, «±IBflSfc«IASft«i:i:*<T?*<5o t fc # o T > ffi — © 40 

« » # ^ a » ^^^-^e.©aK7^^©fe¥*ff9 c i spgf5Mffliifsi^i:^a{tffl-ri. h 
^ l * e » u> < o *> © flaj t? « , a k 73 s =?■ © ^ & v / x a m m it m m is v t> j& c o ^# 

^) o 

[0175]

^ — ^ — , E. col i;fre>01ac> T R Pfitf T A C^P^-^, S V 4 0 (D 
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C T' !i 4 
[0176] 

y V <D £ ? tem^^m^? 5 M&Z'StS $> <DT' & 0 2> o C (Dm £ LTIi, S V 4 0 X. 1/ 
/n>+t, +r^h;*#n<W;l/XcD®#]i8(Dx>/N>'-9- > 4?iJ*-vx> /\>+r, 7 x / •> 
.-OUXXV/n — , U h a ;1/ X L T Rxyny^^t$tl5. 
[0177] 

wm^m^sLxsm^mm^^tsm^icm^x » ss'*** — k ¥ cd fc # cd y * v 

o flS©fg8ifflfi»Jf»fiR#i:l,Ttt» # y 7 f-* x ;Wfc ft *f |b) 8! (c „ ffl & & a* *S lh n K > 

EMtt, lilf, Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd ed.,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, (1989) Clfil^nT^^o
[0178]

& ffi cd ?g in ^ y * - & , &g?# ; ?<Dfggitcfflv">3ci:#T#;i> 0 LOi^^^^-tii 

;UX, S V 4 0<OJ:5 M ^7 -> - 7 £ -T /l/ X , 7f yj'f/l'X, *n 

;U X , is a. — F ^ t£ X -W /I/ X , S0'l/Fn^^;M©i?4^.{^Xfi*0^^ 
-^tttlSo ^ ^ £ - « 3: , Cft6cDitBMcDffi*^fc>^fr£|g«t-3C #J 
A fcf , 3X5 K S tf 7 7 — is 5. F?)J;94/7X = K iv^f 'J *7 

y^-S.r/fSil^<y^ — Sambrook et al., Molecular Cloning: A Laboratory Manual. 2nd ed.,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, (1989) K IE IS £ *l T f -5 „

[0179]

n s sa n t* « , 1 &L±.(Dm£.mm<Dffimw%&m (-r*t>-6ffl»«fait) , xtt is, 

[0180]

m m to 7 « , «ft©#fcK:J:oT^**-««rtfci*A£tt3«:i:#-efr*. ii ^ > 
IK it &. a' *g ^ £0 # ns « , Sit icfe^TiaT-feS. 

[0181]

ifflftflil^Jife^A-T^S^^-li, i v J0OSi*ffl^T> J© 5B X « 5§ §1 cd fc «> ic jS 
«fc^±*fflKrt^»A*3Cfc#T?#3./**xy:rttflSK:«* E. coli, Streptomyces, and Salmonella typhimurium tfiS3.tl%>tp£
niCffi^StlStiD-Ptt*^. M &£.<m ffl m IC it , ^ s Drosophila (D<t9%

tc m tE n 5 t> CD t- « & l/> „ 

[0182]

CC(cie«Jnt^5 J;-5t> B4^£ LTcD^:/^FcD?ggitfS3:Li/'>o L 

o T » *f€Bfl«^7 P f L KcD^fiK^pJ^^Bi^-s^^-^Jiet-r§t >t DT'^.5o Ri ^ ^ * 
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0 «SW5:l^Si^H-J:LT[i, y;^ftys-fig|| (GST) , v ;l/ h - 

^ L ft , pGEX (Smith et al., Gene 67:31-40 (1988)), pMAL (New England Biolabs, Beverly, MA),
pRIT5 (Pharmacia, Piscataway, NJ) tftJtl^^ H ftlC
HJtStlSfcOTMift^o »a4»«»*IH^ E. coli %31^^*-<D0)J£LT«,
pTrc (Amann et al., Gene 69:301-315 (1988)),
pET 11d (Studier et al., Gene Expression Technology: Methods in Enzymology 185:60-89 (1990)) tf^StiSo
[0183]

ffg^^W^-St (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185,
Academic Press, San Diego, California (1990) 119-128) o * S l/> tt , ttftilftSSB^^OETUfi. W * tf , E. coli
|g±ffl^Ofcfe(cl5feWtcffiffl$n§3 F>^4§i9tcig^n^ C htftt § (Wada et al., Nucleic Acids Res. 20:2111-2118 (1992)) o
[0184] 

« m ft T fi $ ft , B«t*5^TfffflT58I^*>-t«kD»a2h5i: i: tttSo S. cerevisiae Oi 9ft»»n»it5^^-0|!: LTtt,
pYepSec1 (Baldari, et al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 (1982)),
pJRY88 (Schultz et al., Gene 54:113-123 (1987)), pYES2 (Invitrogen Corporation, San Diego, CA) tfSSh5o

[0185]

jCtfeTSSo ( W A tf , S f 9 Iffl AS ) *©i'>^*SO»IKfiJffl?n 

5^>?-iCtt, pAc series (Smith et al., Mol. Cell Biol. 3:2156-2165 (1983)), and pVL series
(Lucklow et al., Virology 170:31-39 (1989)) ^titlSo

[0186] 

*»(Bo*5«fc^^Ttt, c c tc ta «g * n r ^ m m ft =? » , * ?l s » si ^ * * - & « 40 

^Tfl¥Lao«8Srtt^iJn5o f ?Lil8I^**-OMt LT(i, pCDM8 (Seed, B. Nature 329:840 (1987)),
and pMT2PC (Kaufman et al., EMBO J. 6:187-195 (1987)) A^JftS

o 

[0187]

c c k n ta z ft x ^ s ft m * * - l t a , sBE^^FSraa-rsfcfcfc^fflTa&o, a 

tl 5> &± „ $1 ii ^ Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A Laboratory
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Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, 1989 IC !2i J tit l/> § 0
[0188] 

^•fe>XRNACD|5¥«:llF^riaSSBa^J^$S^pJ^T*feS^, * fg W . S fc. C CD <fc ? & 

, X«ffi*Wf83i* IIS«W.SfS8i) (CULT, fflilELfc£/*^*-*fc**i5-rs 10 

o 

[0189]

o L/ctfo-c, «±aaBatc«, « & & % m j& , & m £ » us © 

[0190]

^ 3 V . ■< > 7 jl i? ~> a > * V #y x is a Z/ , and Sambrook, et al.
(Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory,
Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY, 1989) K Ei 2 nt ^ 5 <t 5 S ffi ©gf A^t S tl 5 ^

[0191] 

*< , i5ii;fflsaoa*s^^^-4»tc»A^ns<ii:A^T*frSo ra « k , « » # ^ »± , 
«¥jft-ewx*nsjb^ jff a $ ft & frs x it & mft ? ^ ? * - ic !& & z n z c t wc 

£ 3 o 

[0192]

if h 5 > x * ~> 3 > © * »c * £f a x t± * -f m >\< it ■& n it v << fi x t u t w na rt fc 

[0193]

m^m^^t z - <Dmi&w*^tsMffe (D$Kftm mm (omnzt aim t-t 

* - 1*3 *v X t± SU © ^ . * * - * fc # S n 5 c t T- # 5 o v - * - fc it , @ ft £ % ft £ to 

us <o fc #> © -r h^tr-r^ >j ^xtiT^tfv"; y-fistae 1 ?, Rtfna^»ig±iiffliia«ofc 
[0194]

fiStSSS * > * jS U7, B£S, 1 ¥1 3i <D Sffl H§ ^ &tffft«DWIfl&K:*5^T\ iS <?J & lH 
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& fc #> tc m ^ & c t & X- 1 & o 

[0195]

Jh5. ft BE 5>J tt » CtifiO^^^ KKrtfPftSfr, X (i ^ :/ ^ K (c # IS [W| T' & 
[0196]

, SK >S M « , ffi g ft & 3L « ft ft* #»KSt^©«*PW*W*»fl5fcJ:-3T, ?S ± ffl 

ssA^fSt^n^c ttfT-tSo ^ -? f- k « , m.mr y * - •> a tt b , s$ m . 

Sf^ffl^aThy^7-r, 7"7-<-T-'C^Dvh^77'C, fc K n * -> /I/ ;r /< * h * n 

<0)fIt§y77j£fc<fc-pT, HI iR x »BSft5Ci*iti5. 
[0197]

k « m % <o if y n z/ )\> it /<* - v n % , m m k {& # l t * f- y Trt-e«ifi*nsis 
fr-r-Sjieo^mi: u, ^<o^<oJ«^T*g^ci£§fp£ftfc;*^:*:^:y£#t?t><o-e& 

[0198]

C C (c!BI3nt^5^7'f K £ fg 11 U T ^ ?> ffi 2* * ft ± Iffl l& fc (£ , «>5rcOf£fl!ffl&tf 

^ \m - ffl m m $ > m x « 7 5 ^ * > i-^^sitistsfcftic, s e. ic m m * & -5 






[0199]

T&±«flS«, SSiJ - ft WB*^ * V/< * S , XttJUW- 
f£ll <0 & cO ©llf? (C ^ T T* £> 3 o £<Dfcit>, 3k.%$<D 

[0200]

?& ± an fla « s fc , «mwa»»*s^«3B*j - 

a»»cs*sn*ja« ( «y * « , & fti * fs it , x « pi « ) 
t* s s o 
[0201]

it £ ? W fc X ft 5 n It m ±M BS fi > SSfCk hW^Olg 

icffl^sLttft'f 5, s g ^ & * g§ * w % « , » ii tc « 

O <Q 8S # ffl * SI A S e ? Z # A, fc . =7 >y hXli^OXOi 
iie^ti, J«*'4'<Oite ; ?IB*«l*»i*iOllfflJ!aoyyAH: 
(4 ffl Ml «c i/> T , fig L tc Wi VO <0 f y A 4" ic & ff ? 5 n 4 

[0202]



(C ^ M # t5 T MI ftl £0 fti2 <D 

n * <e ji x it m m f ^» ft 



^Sf*^r[p|S-r?»fca6 



m%mr* & k> . m x. a , iki 
ffl^ii^ti, iJ-^±coiiss§yx 

1% © ffi co fjij t b T f± , t n^n 



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c J: ^ T" f 5 o 
[0203]

B fig -r § c t tt t- t % o ■< ^ h a y be aj a i>* # y t f - ^ ft ff *< , f t- tc # £ ti t i> & 

[0204]

U.S. Patent Nos. 4,736,866 and 4,870,009, by Leder et al., U.S. Patent No. 4,873,191 by
Wagner et al. and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring Harbor
Laboratory Press, Cold Spring Harbor, N.Y., 1986

) tc mm ^ nr i/^s o * it. m&oft&tf. i& <o m & ? & & » z. w & <d & m <d & & k m 

[0205]

t± , '^f ';*77--;P1 O cre/loxP iJnyfcft- if <>Xf AT'*3. cre/loxP 'J 3>lft-'tf->^f AlCO^TOffilli, Lakso et al.
PNAS 89:6232-6236 (1992) #1, Uayt:t--|fi'XfAiDt
•5 — O <D 0IJ ti , S. cerevisiae O FLP 'J ayift-if J/7fi?ft8 (O'Gorman et al. Science 251:1351-1355 (1991)) o
cre/loxP 'J 3>i;t--g->^fi l « , lfl*«AIfi?oaS(0«ffitffl^6n
■5 t§ ti , Id 133 fc </> T , C r e >; ny t*t--tfSa*I^jnft^ y^^I©S^5;3- 

&\ — * « a tR * n # -y /< i n * n - f it t /c a * si tl m e ^ % & > isaunyf 

[0206]

cctcie«^n^ichj-xnoae : ?m^^x.»^cD^n->'{i, £ rc , wi imut. 

I. et al. Nature 385:810-813 (1997) and PCT International Publication Nos. WO 97/07668
and WO 97/07669 (C teiJ^T 1,^^ S (CSEo t 4 1 ? tl5 C c I* t 5 .

m m ire m % t , agfi^MAiitt^&oflei, m x \i f* m m « , funt^ 
ttwustciK^snscfctf-ess. s m s n /c m ii as « , m-gmxitwmifeicftm? 
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C 0 

c c 

i n 

*^ 



§ fc 

c o 

n r 

fit 

so 
to o 

[SB 



[0207]

ic ffi ft £ ft T </> 5 ^ 7 K*a^^Sffl*«*fflBa«:^/u«5ie : ?ffl*«*»fttt. 
v i v o0i«T\ CC(Cl2«Lrccfeo^7 T ^^^^:tT'5/ci6(C*fflT^^o L fc 

it^5#l<0ti^W77^-li, i n v i t r o <D MBM^ X (i «BJJS^ <D 7 
^TiiWe)*M:46S^^t>Ln4^o C <D fc £> , Ctittt, g M ffi 5 flF ffl . ^ — 

, in v i v ot'77-b'i't^/ci6(0, fch«fl©»£?*B*»*»14^Jf#"r 
6(c*fflt^§o £fc, HHW»cXtt^±K-oW±(D38aiJ-«W»jR*>^^«a 

nul I aSOKfSJfltSCkfcRlffiTJ&So 

[0208]

«B • fc 43 ^ T . ± ic IB « £ ft fc ± X (O W ft «0 & » ?F « , C c ic # # i: L T ffi 0 ii * 

IB tt £ftfc7? SRtfv'Xf- 2*<D»«fcIERtfXJgfi. *f8W<D(EHR 

M £ B8 » L T IE i£ <* ft T ^ 5 , W I*F IS ^ <0 H tc IB «c 5 ft fc 5£ W « , c <D £ d & fip 

SEQUENCE LISTING

<110> PE CORPORATION (NY)

<120> ISOLATED HUMAN DRUG-METABOLIZING
PROTEINS, NUCLEIC ACID MOLECULES ENCODING HUMAN
DRUG-METABOLIZING PROTEINS,
AND USES THEREOF

<130> CL000897

<140> PCT/US01/42528
<141> 2001-10-05

<150> 60/241,745
<151> 2000-10-20

<150> 09/739,456
<151> 2000-12-19

<150> 09/818,647
<151> 2001-03-28

<150> 09/852,067
<151> 2001-05-10

<160> 4

<170> FastSEQ for Windows Version 4.0



(42) 



JP 2004-531207 A 2004.10.14 



<210> 1
<211> 2327
<212> DNA
<213> Homo sapiens

<400> 1

cgcgcctgcc tcctctcccc aggcctgagc tgcccctccc actgcctttc cttcttcccg 60
cgagtcagaa gcttcgcgag ggcccagaga ggcggtgggg tgggcgaccc tacgccagct 120
ccgggcggga gaaagcccac cctctcccgc gccccaggaa accgccggcg ttcggcgctg 180
cgcagagcca tggaattctc ctggctggag acgcgctggg cgcggccctt ttacctggcg 240
ttcgtgttct gcctggccct ggggctgctg caggccatta agctgtacct gcggaggcag 300
cggctgctgc gggacctgcg ccccttccca gcgcccccca cccactggtt ccttgggcac 360
cagaagttta ttcaggatga taacatggag aagcttgagg aaattattga aaaataccct 420
cgtgccttcc ctttctggat tgggcccttt caggcatttt tctgtatcta tgacccagac 480
tatgcaaaga cacttctgag cagaacagat cccaagtccc ggtacctgca gaaattctca 540
cctccacttc ttggaaaagg actagcggct ctagacggac ccaagtggtt ccagcatcgt 600
cgcctactaa ctcctggatt ccattttaac atcctgaaag catacattga ggtgatggct 660
cattctgtga aaatgatgct ggataagtgg gagaagattt gcagcactca ggacacaagc 720
gtggaggtct atgagcacat caactcgatg tctctggata taatcatgaa atgcgctttc 780
agcaaggaga ccaactgcca gacaaacagc acccatgatc cttatgcaaa agccatattt 840
gaactcagca aaatcatatt tcaccgcttg tacagtttgt tgtatcacag tgacataatt 900
ttcaaactca gccctcaggg ctaccgcttc cagaagttaa gccgagtgtt gaatcagtac 960
acagatacaa taatccagga aagaaagaaa tccctccagg ctggggtaaa gcaggataac 1020
actccgaaga ggaagtacca ggattttctg gatattgtcc tttctgccaa ggatgaaagt 1080
ggtagcagct tctcagatat tgatgtacac tctgaagtga gcacattcct gttggcagga 1140
catgacacct tggcagcaag catctcctgg atcctttact gcctggctct gaaccctgag 1200
catcaagaga gatgccggga ggaggtcagg ggcatcctgg gggatgggtc ttctatcact 1260
tgggaccagc tgggtgagat gtcgtacacc acaatgtgca tcaaggagac gtgccgattg 1320
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attcctgcag tcccgtccat ttccagagat ctcagcaagc cacttacctt cccagatgga 1380 
tgcacattgc ctgcagggat caccgtggtt cttagtattt ggggtcttca ccacaaccct 1440 
gctgctgtct ggaaaaaccc aaaggtcttt gaccccttga ggttctctca ggagaattct 1500 
gatcagagac acccctatgc ctacttacca ttctcagctg gatcaaggaa ctgcattggg 1560 
caggagtttg ccatgattga gttaaaggta accattgcct tgattctgct ccacttcaga 1620 
gtgactccag accccaccag gcctcttact ttccccaacc attttatcct caagcccaag 1680 
aatgggatgt atttgcacct gaagaaactc tctgaatgtt agatctcagg gtacaatgat 1740 
taaacgtact ttgtttttcg aagttaaatt tacagctaat gatccaagca gatagaaagg 1800 
gatcaatgta tggtgggagg attggaggtt ggtgggatag gggtctctgt gaagagatcc 1860 
aaaatcattt ctaggtacac agtgtgtcag ctagatctgt ttctatataa ctttgggaga 1920 
ttttcagatc ttttctgtta aactttcact actattaatg ctgtatacac caatagactt 1980 
tcatatattt tctgttgttt ttaaaatagt tttcagaatt atgcaagtaa taagtgcatg 2040 
tatgctcact gtcaaaaatt cccaacacta gaaaatcatg tagaataaaa attttaaatc 2100 
tcacttcact tagccgacat tccatgccct gaccaatcct actgcttttc ctaaaaacag 2160 
aataatttgg tgtgcattct ttcagacttt ttcctataca ttttatatgt agaaatgtag 2220 
caatgtattt gtatagatgt gatcattcct atattgttat tgattttttt cacttaataa 2280 
aaattcacct tattccttaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 2327 
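
The relationship between the nucleotide sequence of SEQ ID NO: 1 and the 510-residue protein of SEQ ID NO: 2 can be checked by conceptual translation. The following is a minimal sketch, not part of the filed listing. It assumes the standard genetic code and that the open reading frame starts at the atg at base 190 and ends at the tag at bases 1720-1722 (positions inferred from the listing, not stated in it). The names `CODON_TABLE` and `translate` are illustrative only.

```python
# Standard genetic code, with codons enumerated in t, c, a, g order
# (the conventional layout of translation table 1).
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate from the first base and stop at the first stop codon.

    Whitespace (the 10-base block spacing of the listing) is ignored.
    Ambiguous bases such as 'n' are not handled and raise KeyError.
    """
    seq = "".join(dna.split()).lower()
    protein = []
    for pos in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE[seq[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)


# Bases 190-240 of SEQ ID NO: 1 (assumed start of the coding region).
start_fragment = "atggaattctc ctggctggag acgcgctggg cgcggccctt ttacctggcg"
# Bases 1681-1730 of SEQ ID NO: 1 (assumed end of the coding region).
end_fragment = "aatgggatgt atttgcacct gaagaaactc tctgaatgtt agatctcagg"

print(translate(start_fragment))  # MEFSWLETRWARPFYLA
print(translate(end_fragment))    # NGMYLHLKKLSEC
```

In one-letter code, the two printed strings correspond to the first 17 residues (Met Glu Phe Ser Trp ...) and the last 13 residues (... Lys Leu Ser Glu Cys) of SEQ ID NO: 2.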

<210> 2
<211> 510
<212> PRT
<213> Homo sapiens

<400> 2

Met Glu Phe Ser Trp Leu Glu Thr Arg Trp Ala Arg Pro Phe Tyr Leu
1 5 10 15

Ala Phe Val Phe Cys Leu Ala Leu Gly Leu Leu Gln Ala Ile Lys Leu
20 25 30

Tyr Leu Arg Arg Gln Arg Leu Leu Arg Asp Leu Arg Pro Phe Pro Ala
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35 40 45

Pro Pro Thr His Trp Phe Leu Gly His Gln Lys Phe Ile Gln Asp Asp
50 55 60

Asn Met Glu Lys Leu Glu Glu Ile Ile Glu Lys Tyr Pro Arg Ala Phe
65 70 75 80

Pro Phe Trp Ile Gly Pro Phe Gln Ala Phe Phe Cys Ile Tyr Asp Pro
85 90 95

Asp Tyr Ala Lys Thr Leu Leu Ser Arg Thr Asp Pro Lys Ser Arg Tyr
100 105 110

Leu Gln Lys Phe Ser Pro Pro Leu Leu Gly Lys Gly Leu Ala Ala Leu
115 120 125

Asp Gly Pro Lys Trp Phe Gln His Arg Arg Leu Leu Thr Pro Gly Phe
130 135 140

His Phe Asn Ile Leu Lys Ala Tyr Ile Glu Val Met Ala His Ser Val
145 150 155 160

Lys Met Met Leu Asp Lys Trp Glu Lys Ile Cys Ser Thr Gln Asp Thr
165 170 175

Ser Val Glu Val Tyr Glu His Ile Asn Ser Met Ser Leu Asp Ile Ile
180 185 190

Met Lys Cys Ala Phe Ser Lys Glu Thr Asn Cys Gln Thr Asn Ser Thr
195 200 205

His Asp Pro Tyr Ala Lys Ala Ile Phe Glu Leu Ser Lys Ile Ile Phe
210 215 220

His Arg Leu Tyr Ser Leu Leu Tyr His Ser Asp Ile Ile Phe Lys Leu
225 230 235 240

Ser Pro Gln Gly Tyr Arg Phe Gln Lys Leu Ser Arg Val Leu Asn Gln
245 250 255

Tyr Thr Asp Thr Ile Ile Gln Glu Arg Lys Lys Ser Leu Gln Ala Gly
260 265 270

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Val Lys Gln Asp Asn Thr Pro Lys Arg Lys Tyr Gln Asp Phe Leu Asp
275 280 285

Ile Val Leu Ser Ala Lys Asp Glu Ser Gly Ser Ser Phe Ser Asp Ile
290 295 300

Asp Val His Ser Glu Val Ser Thr Phe Leu Leu Ala Gly His Asp Thr
305 310 315 320

Leu Ala Ala Ser Ile Ser Trp Ile Leu Tyr Cys Leu Ala Leu Asn Pro
325 330 335

Glu His Gln Glu Arg Cys Arg Glu Glu Val Arg Gly Ile Leu Gly Asp
340 345 350

Gly Ser Ser Ile Thr Trp Asp Gln Leu Gly Glu Met Ser Tyr Thr Thr
355 360 365

Met Cys Ile Lys Glu Thr Cys Arg Leu Ile Pro Ala Val Pro Ser Ile
370 375 380

Ser Arg Asp Leu Ser Lys Pro Leu Thr Phe Pro Asp Gly Cys Thr Leu
385 390 395 400

Pro Ala Gly Ile Thr Val Val Leu Ser Ile Trp Gly Leu His His Asn
405 410 415

Pro Ala Ala Val Trp Lys Asn Pro Lys Val Phe Asp Pro Leu Arg Phe
420 425 430

Ser Gln Glu Asn Ser Asp Gln Arg His Pro Tyr Ala Tyr Leu Pro Phe
435 440 445

Ser Ala Gly Ser Arg Asn Cys Ile Gly Gln Glu Phe Ala Met Ile Glu
450 455 460

Leu Lys Val Thr Ile Ala Leu Ile Leu Leu His Phe Arg Val Thr Pro
465 470 475 480

Asp Pro Thr Arg Pro Leu Thr Phe Pro Asn His Phe Ile Leu Lys Pro
485 490 495

Lys Asn Gly Met Tyr Leu His Leu Lys Lys Leu Ser Glu Cys
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500 505 510

<210> 3
<211> 31208
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> 4027-4221; 8990-9744; and 12103-13769
<223> n = A,T,C or G

<400> 3

ccagcctctc ttaggctcct aaatatagtg caaaaagttc cagagttcct ttgttaccca 60 
tgaaagcaca tggaacggtg ctggacaggg gcaactggcc ctggagcaga ggagtaactg 120 
catagaactg tccaagcctc agagggagtc acaccaccag caagaacctg ggtgggagta 180 
ggtgagccaa ggggttccca ggctctgacc ctgccaagag aactcattag aaggtcacca 240 
accacacata ctattcctcg gtctcatgaa gaacccaggg accggaccag gcaagatatc 300 
acaaagctga agtttcagct ctggggcaga gcatggatct gaggtctttg gccctaccac 360 
catgcgatca tatgagggcc atcatacaac catcatgatt tgggggagga atagggcata 420 
gaggaatcat atgaaaagct gaaatgccat gagttaccca gaagaagctg tgtaagccag 480 
aggattctga gaccctgtca aataacaaca tctagttgaa ggttggagtt aggtaggagg 540 
tagggaagtc tgggaaagaa ggagctgaaa cacttgctgt gtgtggctta atggaacatg 600 
caaggggcca ggacgaactt ggtccagatg aagtcaccac cccctggggc ctgtcttttt 660 
tttttttttt tttttttttt tgagacggag tctcactctg tcaccaggct ggagtgcagt 720 
ggcgcgatct cggctcactg caatctttgc ctctcgggtt caagcgattc tcctgcctca 780 
gcctcctgag tagctgggat tacaggcgcg cgccaccacg cccagctaat tttagtactg 840 
ttagtagaga tggggtttca ccatcttggc caggatggtc ttgatccctt gacctcgtga 900 
tccgcccgcc tcggcctccc aaattgctgg gattacaggc gtgagccacc gcgcccggcc 960
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ccctggagcc tgtcttaatc acttacccgc caaataaaat ctggctccag agagtggagc 1020
gtaggcttaa ggaattgggg gcggaagggc ggggaaggtg ggggagggac agtgataggg 1080
agaacaggga attgtagcag aaattgggtt tattgttcag agctgtcaat gaacacttaa 1140
catatgcctg tcttagccta aatcaatgaa taaatgaatg aataaataaa tgaatgaaat 1200 
gtgggcaatg cctataaaga ttgctgggac agggaggtgg ggggagacac cagcttggga 1260 
agtcaggcct gttagatcct agttcaccac ctgatacgtt acaaatacta aaaccatcac 1320 
tttcaaatta tttttactac attttcctgt tatctgtact cgagtttatt tatgtttctg 1380 
gcatctagag tcagcccttc atgggcatga gacccaagca gccacacgag gctctgaacc 1440 
cagaagagca tatgctcggt ttaatggtct gtcatcttag aattgttaat aaagttttta 1500 
tcccgcattt tcattttgca ctgagattca taaattatat agcaggccct gactgtacct 1560 
gtatagtgga attactatat gatggtacgc tactgtgcat atcttccccg ttcagtgttc 1620 
agtgccctcg tatcggcagc ttgaactagc tcatggtaca cgctgggaat cagggtggga 1680 
atcagttgta aaccatttac cggaacacca ctaggcaggc cacaggataa aggaataatg 1740 
atggtacacc tccccctacc tctaccacct gggaattttg gtagaatgcc agaatggaaa 1800 
agaaaatctc ttgcatagcc atttataatt tgtgataagg aagaaaaaca atgacctcag 1860 
ctttagcatt attttacaat ataaattcag atcccgtgac tgaaaactgt tggacttaaa 1920 
agaggacgct ccaggagcgc aaaagcagtt gggccgaacg aagcgtgcgc gctttggtaa 1980 
ccggctagaa atcccgcacg cgcgcctgcc tcctctcccc aggcctgagc tgcccctccc 2040 
actgcctttc cttcttcccg cgagtcagaa gcttcgcgag ggcccagaga ggcggtgggg 2100 
gtgggcgacc ctacgccagc tccgggcggg agaaagccca ccctctcccg cgccccatga 2160 
aaccgccggc gttcggcgct gcgcagagcc atggaattct cctggctgga gacgcgctgg 2220 
gcgcggccct tttacctggc gttcgtgttc tgcctggccc tggggctgct gcaggccatt 2280 
aagctgtacc tgcggaggca gcggctgctg cgggacctgc gccccttccc agcgcccccc 2340 
acccactggt tccttgggca ccagaaggta aatggaaggg aaaaaggnta gaaaaggagg 2400 
aagagggggg cggaggagga tgcggcagag gagcccagcc ggcagagaga cgcagctttc 2460 
ttccatccct ggggaccctc cggcttgcac cggcctttcc agcccggcct gtggctctta 2520 
gcatcatttt tccttgctct ggagaattgc tttcccgcag ccccacaggg aaaggtcaca 2580 
aaagaggaag ctttgggggc tgggagagag ctatttaaag aacctgaata tggaaaaaga 2640 
aagcgagctg taactcaagt ctgtctctca ttgcttcacc aagccttcca catgtgttgc 2700
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tttaaaaata gcatgttatt ctaaataact tattagttgc agaaaatatg caaaatctat 2760
cccaatcgtt ggcaccctta gtccatttta acaagagaaa attttctttt cctaagattc 2820
ttgtgaagta aggagcagcc ccagccagcc actcgagaaa tactgattga tggaaatttg 2880
taaagggaga ctgttagctt ttggtctctc ccgtttttta aatccactcc cacccctaat 2940
taaggttttt attcattcaa ccgactctga gtggcaattg tgtgataggt actaagatta 3000
caaagagaag ctaagtccct cccctgcacc acccaagtca ggtgcagact taggccacag 3060
agagaaaatg aaaatttaag gcaatgggtg ctttactaga ggcctagaga caagggaata 3120
tctgtcggag gaaagtatac atctccgcct agagaaggaa ggaaagtctg tgaagggctg 3180
agcagagtct taaaggatgg ttgggtggtg tggggaaggc attccagcag agctactaca 3240
cgatcctttg gtttccccac tttctagtct ttcttatata aagcaaccac tttcaactct 3300
tttatcggtt tcttctggta tttaaatact tatttgtaaa atagtattac catattgcat 3360
ctattaattt aataagttta gacatctgct gtggtttaga tatggtttgt tcgtccccac 3420
caagcctcat gttgaaattt gattcccaat gttggaggtg ggatctgatg ggagatcttt 3480
gggtcattgg gatggatccc tcatgaatgt cttggtgcag ctgtctcctt cataagttct 3540
cactctctta gtccctcttc aacccccaga actgattgtt gaaaagagcc tgccacctcc 3600
tcccctctct cttcctgtct ctcaccatgt ggtctctgca cacaactgct cctgttcact 3660
tccactatga gtggaagcag tctgagatcc tccgcagatg cagatgccaa tgccatgctt 3720
cttgtacagc ctgcagaatt gtaacccaaa taatcctctt tgtgaatgac ccagcctcag 3780
gtattccttt acagcaacac aaatgtacta agacaacatc cacctatgaa cttctttatg 3840
acaggcaatc acttacactt catattccac tgtcccagta actatatagt attgtatttt 3900
ttaaatagaa aaacttctat ttgtattatt tttattatgc aaatgttatt tactgctgat 3960
ctaaatggtc ctctttcatt ttatttcctt ttctcataga actttttccc cacccccaca 4020
gtattgnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4080
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4140
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 4200
nnnnnnnnnn nnnnnnnnnn ntgttatgta tctctactgt ctcatgaata ctatgtcgtc 4260
tgttgtttta attgaattgt tttggcatcc ttgtcaaaaa tcaattgacc ataaatgtca 4320
aggtctattt ctgagtcttc aattctaatc cattgatcta tatgtctatc ctaactcatg 4380
gacacagaga gtagaaggat ggttaccaaa ggctgggaag gatagagggg agctggggga 4440
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ggaggtaggg aaggttaatg ggtacaaaaa aaatagaaag aatgaataac acctactatt 4500
tgatagcata gcagggtggc tatagtcaat aataactgta cacttttaaa taaagagtgt 4560 
aataggattg tttgcaactc aatggataaa tgcttgaggg gatgggtacc ccattcttca 4620 
tgatgtgcct atttcacatt gcatgcctgt atcaaaaaca tctcatttac tccataaata 4680 
tatacaccta ctatgtatcc acaagtatta aaaattataa ataaataaat tatatagcta 4740 
tccttatgct agtaccacac tgccttactg ttgctttgta gtaagctttg aaatcaggaa 4800 
gtatgagtcc cccgcacttt ggtattttcc aagattattt tggctgtttg gaatccttga 4860
tttctataca aattttagac tcagcctatc aatttctaca aggaaaccag ctagggttct 4920 
gcttgggatt gcactgaatc tgtagatcag tttggggatt attgccatct taagaatatt 4980 
aggtcttctg atccatgaac acagaaagcc tttccgttta gttaggtcat ctttaatttt 5040 
ttttgttgtt tttttttgtt ttttgagaca gagtcctgct ctgtcgccca ggctggagtg 5100 
cagtgacgca atctcggctc actgcaacct ccgcctctcg gattcaagcg attctcctgc 5160 
ctcagcctcc caagcagctg ggactacagg cacatgccac cacaccaact aatttttgta 5220 
ttttcagtag agacggggtt tcaccatatt ggccaggcta gtctcgaact cctgacctcg 5280 
tgatccaccc gcctcaccct cccaaagtgc tgggattaca ggcgtgagcc accactcccg 5340 
gctttcttta atttttttta acgatgtttt tgtatttttc aaagtataca tcttgcattt 5400
cttttgttaa atttatttgt tttgttcttt ttaatttcat ttcagactat ttattgcatt 5460 
catagtgttt tagagtccac attccctctt gactgtcact aagttttttt ttttctgttt 5520 
ttgagaggtt tctatcagaa ttttgcagat cagagatgac ggacatgtca aactgtctaa 5580 
tattaccaac cctccccatt tatcagatca ggatcctttt ggtgattcac catgcaggga 5640 

aatctagtat ctaaggctca aaaggtgata ctgttttaca taggcagtaa cattttattg 5700
ctacataata actacatatt tatggagtac ctgtgatatt ttgatacgtg catacaatgt 5760
gcagtgatca aatcagggtg tttagggtat tcatcacttc taacatttat tatttatttg 5820
tgtttggaac atttcaagtc tcttcaagct cttcagaaat attcaataca ttattgttaa 5880
cagtgctatt gaacactgga acttattcct tctatctaaa gacagtaaca ttttaagtat 5940
agtcataagg ttacagaagg ataaagtgtg tatagggaaa attccctaca agatgagaat 6000
ttcattcctt actcttagta atacaggtct tcaaacatgc caaggatatt cctcccttgg 6060
agctttgaac atgcacgtct gtggttatat tgctctccct gcaaattatt cctaaaagag 6120
gcttgccctg accattcaga ctaaaatagc acctctagta ctctctatct ccaaccctat 6180
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tattattatc ttggccctta tcactctctg acactatact gtatactctt ttgcttgttc 6240 
gtttattatc caccactaac tacaatataa aatctgtgag aggtaggatc tttgtttgcc 6300 
actataaacc tagtgcatgg tacagttcct ggtgcataat aggtgctcaa taaatccttt 6360 
gttgaatgca taaatatatt aggtgctgag aaaatttatt tattcaaaga tcaatttact 6420 
gcatagaata ggccaggtgg tttgacattt attcaatagc caacatatgg gacctaggat 6480 
gtacatatgc aagtgtgtgt gtgtatgtgt gtgtgcatct gcatgtgtac ttggatgtac 6540 
tgcagagaac atctatgtag ctaagtagta taaagcactt gggctccaga gttaaactgg 6600 
agtttgaatc ctcattagtg gttgccagct gtacacactt gggcagatca tttaacctag 6660 
tctgtagggc tcaatttcct catctctaaa gtagggattg taatcatatc tacttcatag 6720 
ggttcttgat gtaaatattt aataacatag aacatggaaa gcatttagca gcacctagtt 6780
catagcagtg cttgataaat gttcgctgtt gctatttggg ggcactatgc attttctgaa 6840 
catttctgaa caatgtttac taaatatatg tagtacccgt tttcaagtgt atttagatgc 6900 
ttctctgggg atgaagaaat ataaattaaa tatagtacag tattcacaac agttttctgt 6960 
cctttttgtc tagtcaggag ttacaaaaag tataatgaaa tactttcata tggctggggt 7020 
gtttatgaaa attttttacc taaacaaaca attgtcatat tagtttacaa tattcatgag 7080 
ggcaaaggcc ttgtcttcct tatatttctc tgtatctcta ccacctggta cgtgtgatag 7140 
acaataaata cttgtgtgtt tattgtttgt aaatgaataa atgaaaaaat attcacattg 7200 
ttgaaaacca ctactctgga tagtcagtgg gtgcttatca ctggcttgat tatggcaaca 7260 
ttaacaaaaa agtgcagtat tttagaaact aggtttcaag actctcaacc tttcagtggc 7320 
cttgaactat ccagagaaca ctttatgggt taaaattgct aaatgataac agagaaaaat 7380 
gggagccaga gttgtccacc tctccagagg atgagagcaa acaatcctgc agcagatacc 7440 
gtgtgattgg tcacacgagg aaaaatctgg cagccttaag attactttgc agcgggggac 7500 
tcccaccatc atgctcaagt gtgtagatgg gcacaccaaa acacacacat gcaggtgccc 7560 
tccactttac acaagaagca aatgtaaatg aatcttgttt tcagtgattt agagaaacaa 7620 
tttaagtgag ccattactca tctgcttcta aaagcaaaaa ctccttctct ggtggtagta 7680 
tttgcactct catttgtaaa tgttggaagc tgaaagtttt gtatttgagt ttgctttaag 7740 
attcacacat ctgtgtaaat ggaccttctg ttgttggggg gagaatttgg attttcttta 7800 
tagatagagt tggcaatttt ttagagagaa gcatttactg ctaagtcatg agaaataatc 7860 
actggtgcat aattagagag aggaacagga agaagaaatg gtgagctgga tgtagggtca 7920 
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tgccccattt agtaactgtt agtttcccac ataggaaata cttcttttta gcttccagat 7980 
cccactccaa tctgagtgtg tgatgttggc aagtgaggca gagagtgtga ctcggctcac 8040 
cctctattgg gacaagagtt cacagtaaat gtcattcaac agtgacttgg tctgggggta 8100 
caggatatat taatattgag aagataaata cactaacttt gtttagagaa ttatccccca 8160 
agcttagaag tcccaaagaa agcatgttat gtcacttcca gaaaagtctc aggctcctct 8220 
gcttgtgtga ccttatcagg tcctgaactc agcttgtgtc tataagaggg gacaggtcca 8280 
gcttggctgg ctaattactt ttactttttt cactgcagtt tattcaggat gataacatgg 8340 
agaagcttga ggaaattatt gaaaaatacc ctcgtgcctt ccctttctgg attgggccct 8400 
ttcaggcatt tttctgtatc tatgacccag actatgcaaa gacacttctg agcagaacag 8460 

gtaagaagag ggggaaagct ctgggaccta ttcctcctag aagtgaaatg cataaaaccc 8520
ataggcaaga ttccaaagca aagattggtt tggggccttt aagagacaca gcagcaagta 8580
tggggaggtg acaggtttcc taccaatact gaaggggatt cccatatcct ccccagtccc 8640
ttgtcttgtt caggtatgca tgggcacgtt gaagtcggta taacttaaag cctagctggc 8700
attaccagac ttgccaggca aggcttccct tggcctctgt gggttttatg acttcagtgt 8760
cagcaacact tcccactcct acccctggtc tcgagcataa gtctcaagag ggtgggaaat 8820
cagcagtaac tctacctctg ctggttcagt atgaaagcct gaatgctaga tcattaattt 8880
acccatcaga cctcttgatn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 8940
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9000
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9060
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9120
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9180
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9240
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9300
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9360
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9420
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9480
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9540
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9600
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9660
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nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 9720
nnnnnnnnnn nnnnnnnnnn nnnntctgct tgactctgca gatcccaagt cccagtacct 9780
gcagaaattc tcacctccac ttcttggtat gtatgtgcaa atgagaggta taacccactc 9840 
tcattcaaag tcccctttcc atagtagagc atgccaaaga aactgaaatc tgaattcaaa 9900 
agcacaaaga gtgcaaggta gagctatact gaacgttatc taggggaaag attgaagggg 9960 
agctctaagg tcaacacacc accacttccc agaaagcttc ttcatccgtt tctctcccac 10020 
aaagtcttat tctcaaggca gcagatacat gaatctgtcc cctctctctt taaaactaca 10080 
gccttggcca ggcacagtga ctcatgcatg taatcccagc actttgggag gccaaggtgg 10140 
gaggatcact tgaggtcaag atttcaagac cagctgggcc aacatggtga aatcccatct 10200 
ctactaaaaa tacaaaaatt agccaggcat ggtagcatgt aggcctgtag tcccactact 10260 
tgggaggctg agacatgaga atcgcttgaa cctaggaggt ggaggttgcc gtgagctcag 10320 
attgtgccac tgcactccag actaggtgac agagcaaaac tctgtccgca gcccccaaca 10380 
acaaaaaaaa aactacccaa actgcagtct caccatccct attcttgttt tctttatcct 10440 
tctctcgttt tcttggatgt tttcctttct ttttggagtt cctttatttc cacatgcgag 10500 
tcagtaaaat tttgctctag agtttggcaa tattctgtca gcagataaac taagctcttt 10560 
aattacataa ttggtattta tgttaaacaa gacatgaatg aaagaaaaga atataggctt 10620 
gtattaggaa ccacttaaat ttgaatcttg ccccctcctg cattgactag ttaaatatga 10680 
tcttggggaa gtcatttaat ctctccctat ctcagtttcc tcatctttga caataaggat 10740 
gagactcaca ttgctgggct gttatgagga ttaaatgaaa tacatatttt tagcactaca 10800 
tgtaatggcc accattgtat gagtgacaga tcatgcatca tgagcctgga atgttgtaag 10860 

cattcaatga atggtatcaa ttatgtatta ataaacttta aagtcctttt aaagccaaat 10920 

cctaatgacc agtctggcaa tagaagattg tgaagcatta gccttggtaa gtatttccac 10980 

atagtatcat tcatagacct gggctcaagg aggaaatatc aggggacaga gtggacactc 11040 

ttgtctcttt ccttgtgaat ttatgttcat catatagttt atggattggt ttggagtgga 11100 

aaggaattca cttgctctgt tactagtgtg agctagggag taggttggct accttatgta 11160 

ttcactttca gttaacctcc acagcaacac agggaaaaag gtatttagta tcatagttca 11220 

ttattgagaa aagtaaacct caggaagatt gagtcactta ttcagttact acataggtag 11280 

taactggtga tttcaggatt agcgtgctaa tcttataagg ctttgaaatt tattagactt 11340 

tgaaactgtt tctcacaata ttaaatacat ccatcccaga ggtaagcttc taaattcacc 11400 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13200 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13260 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13320 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13380 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13440 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13500 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13560 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13620 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13680 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 13740 
nnnnnnnnnn nnnnnnnnnn nnnnnnnnng gtcaggcttt gctgggggca gctccctgca 13800 
acagctcctc tccacacttg ctctgtttct cacttttgaa tccaaacgtt tttgaaaatg 13860 
ttctgagttt attttaaaat gtggctatgg tggttgagag cagtggcagg gtacctagca 13920 
agtttggaat tgaagttgga ggaagccctg gggtaaaccc cttgtaatta tgggtcttgt 13980 
gtcaatgatt gctttaatgg aactctggtc tgtttgaaag cagagttatg gtaataattg 14040 
aaaagccgca gatctttaac tcagccattt accatatatg cagttttctc catgctcctt 14100 
ctcactccgc tgggtgtatt tttcccttcc tcgtgccctg tgtaagcaca tggcttattt 14160 
actcatgtga tctttggttc ctgctgggtc agggttgtct ccattagatc ataaaaacag 14220 
ggccaggcag gagccttcaa atgaaggcaa tttggtcatg gtggtggtga tgatgttggt 14280 
cttgacctcc tgtgccagga taagtgggag aagatttgca gcactcagga cacaagcgtg 14340 

gaggtctatg agcacatcaa ctcgatgtct ctggatataa tcatgaaatg cgctttcagc 14400 

aaggagacca actgccagac aaacaggtca gtggtgggag agcaaaaaag atatttcttc 14460 

acattttcta agttgtttat taacacatta tcccaacttt ctcttctagc acccatgatc 14520 

cttatgcaaa agccatattt gaactcagca aaatcatatt tcaccgcttg tacagtttgt 14580 

tgtatcacag tgacataatt ttcaaactca gccctcaggg ctaccgcttc cagaagttaa 14640 

gccgagtgtt gaatcagtac acaggtattt gttgggtttg ggttgcccac gtccatacgc 14700 

tgccatgatt gtactgtgtc tgtctagagg gataaacctt aatatgacaa gagaaagaat 14760 

ctttgttatt aatggagctt ttatatagac actgctccaa agaaatttga cttgagtcct 14820 

ttataagact ttgcttcaac catagcagta ttatcagaat ttttatatat atatatatac 14880 
actattttta ttatggacaa ttattattaa tacaaatata agtaggcact taagagttcc 14940 
agacatacat ggaatatggc tttttgcaca gcgattgcag taataataat gacaagctaa 15000 
aaacattcat gcaacatagg aatggagagt ggaacagagt aaacatggac atgcacccga 15060 
aagaatattg attcaaaaac agttttagca agcataaaca caaaagttga aatagattaa 15120 
gctttttaag caattcaaca ttacttgtca tgaatgccat aatggagaat acttatcaag 15180 
cagtgaatta atccttcatc agcttcacca cttactagca gttactagta agttacttac 15240 
tgctttgttt cagtgtcatc tataaaatgg agattaaaaa agaacctatc tcatacattt 15300 
gttgttacga tgagtgggtt aatatatata aagcatttag gacagtgcct ggcactgaat 15360 
agatgttaaa tgtaaagtat agttatgtca aatgtctttg cttccaggaa ttttgcaaga 15420 

cacaccaaca tatgcacact tacacataca tatatgcata catgcacata gatattataa 15480 

agaggacact cagagaagca ggttataaac aatttaaggc ataaatgggc attataaata 15540 

gcagcagttc ccaagtcttt ctgcatcatt gcacacacag aaaatgttaa tgtttttgtg 15600 

cttcattgga gtaaacagga atggatttgg gggaagctat acagaacttt gtaaaaaaaa 15660 

atctttactt tttaaatatt atacaattat gatgaaaaag caaaatgcaa agtgttaggg 15720 

aaaatattaa atgttaaatt tattcaaaac ttaaaacctt ttcaattttt tttttttttt 15780 

ttttttgaga tggagtctct atcactcagg ctggagcgca gtggtgtgat ctcagctcac 15840 

tacaacctcc acctcccagg ttcaggcaat tctcctacct cagccttctg agtagctggg 15900 

attacaggca ctgccaccac acctggctaa tttttttaaa ttgtttattt ttatttagtc 15960 

aaatatatca atattttatt ttattgcatc tggattttta gtaatcacaa aaagccattc 16020 

tctattccag ggtttctcaa ccctcagcac taatggcttc ttagattaga taagtccttg 16080 

ttgtcaagat gtgtgcattg taggatgttt agctacatcc ctgacatcta cccactcgat 16140 

gtagtagagc tctgatagtt atagcaacca taaataactc cagacattat tgaatgttcc 16200 

cagggccccc agttgagaac cactgccctg tacccaggtt gtagagaaaa ttatttatgt 16260 

tttcttgtag tacttgtata atttcattat tttcatattt aaatcagaga tctaaactcc 16320 

atttagaatt tattcctata tatggtgtga ggtattgatc taatttttcc aaatgtttat 16380 

ccagttgtcc catcaccatt atttaaaagt ttatcttttc aagtgatttg agataaccat 16440 

cacattctaa acggatacat gtactggtat ctgttttgga taagagtata tttggatgtt 16500 

ctcgtgtatt ccattgatct atctaccaat gtaccagaat cacactgttt taattaagga 16560 

gattttgtgg cttttttcaa cattaataga ccttattttt agaaaagttt taggtttgca 16620 
gaaaaattca gcagaaagta cagagagttc tcatattacc catgtaacaa acctgtacat 16680 
gtacccctgt atctaaaata aaagttgaaa ttttttaaat agtaaataaa tattacctct 16740 
gttccatatt tttgttttgt tttttttctc tcagctcctt caattataaa tatattggca 16800 
tttctttgcc tgtcttctat ttcattccat tttatttaat aacttttccg tgaagataaa 16860 
atattagact gaggaagaaa agaataattg gtcacttgca tctaaacttg aaatcatctt 16920 
aattttattg cccacatact gatggaaact atgtttttta tttgtgttgt ttatctttgg 16980 
agctttaatc aaaagtccct ttgatgagaa aataaaccat ctgtgaaaat tagatctatt 17040 
taaacgtctg gaaatcaggc aagatttgaa gctattcact aaccatggct tgctttataa 17100 
tttatttgac tttgccatca ctttggtaat tggaaactat ttttctaccc agatacaata 17160 

atccaggaaa gaaagaaatc cctccaggct ggggtaaagc aggataacac tccgaagagg 17220 

aagtaccagg attttctgga tattgtcctt tctgccaagg taaatcttct aaatttctaa 17280 

gcctgctcaa gtgaccagtt aattatgtaa gtaggtgggt aagtgggaat gggatgggga 17340 

gacaagaata aaaccgattg actaaattta actgtacttt gaattgatga gcagcttcat 17400 

gcaatttgag acaaagagag aattctgcaa ctgtgtcgct agaggagggt tagtaaagac 17460 

taaacgaacg atttgacaag atttgaggat tgtcatatgg atacatggat tttagggcat 17520 

catgaaaaaa tggtcacatg gataaacgta aaaattatga tgataaggtc ctgggaaatc 17580 

tgggagtttg aagagaattt ctagggcctg ttgatcgagg gccctttgtg caaggcctgc 17640 

ttttcttatc taaccttggt tctcctttat gctttgggca gaatatggtt tataccacat 17700 

atttgttgaa ctgaattaaa atttaaaccc ctatttaaag ctctgatttt tcccctcaaa 17760 

tcattattgt ggttgtatct ccaaacattt ataaactggc attttattta aaatatttgt 17820 

attgtacttt ctaggatgaa agtggtagca gcttctcaga tattgatgta cactctgaag 17880 

tgagcacatt cctgttggca ggacatgaca ccttggcagc aagcatctcc tggatccttt 17940 

actgcctggc tctgaaccct gagcatcaag agagatgccg ggaggaggtc aggggcatcc 18000 

tgggggatgg gtcttctatc acttggtaag atctgcaccc ctaaattttc ctgctagttt 18060 

tccccctgag attttgcttt attttttgcg ctggtacctt agtgacccta gtgcctcagg 18120 

atatgtgtag gtgaaacaga agaagtaggc tacttttctg ttctttctaa agagagctcc 18180 

aaattattct cttgtctttc aggaaaaaaa aaaaagttta tttatccata aattgtctgt 18240 

cattggtttt ctaatcaatg gtgtgtgaaa tgtcttattt ctttatttca ccttggctct 18300 

gatgcattgg aaatgaggac ttgatccctg ggctggcact tagaacttaa acaatagggt 18360 
ccaagtggag ctcctcttct gagagagctg aatgattagc tgcattattt aaggctcatt 18420 
ttagacatct cccagccgct tgtcaccaat tttattcctc aggattgatt ttagacttca 18480 
gacataatat tcgatgatat atactatagt taagtttagc aaatatggac tgaggacatt 18540 
ttaaalactg agactttttt tatgactaca atttattgtg ggccctgtct tcggtgagct 18600 
aatggtctaa tacaggagac aggagacaga cctccaaatt gcagtgtagc ataatgaggg 18660 
caatgataga gatatgtgct ggctaacaca aagacataga agacaggtac ctaccctggc 18720 
atgggagctc aaggagactt ccttgacatt tacgctgact gcaggataag taggagttag 18780 
ccaggtggaa actgtcatct ctatcttgct agactttaag catatactgc tgttaataaa 18840 
gcccaggtta tgctgtttgc aaagataaaa tgtgttcctg acataatact ggtcaaaggg 18900 
acagaaagac agaaatgcta aggacaattc agcagcagac cagataaaaa acaccatatt 18960 
tcatatgcaa aagtcaactc aattgaaaca tttgtaaaac caaatttgac attataaaag 19020 
tatatcagag atctcatttt ataaggaaat agaagccctt tcctaccata aactaaagat 19080 
ttaatctata tagcacaaaa tacaatgttg agtaatcatt tttaatttat tttttaactg 19140 
acaaaaattg tgcatataca tgttatatat atatgtatgt gtgtatatat atatgatgta 19200 
caacatgata ttttgatata tgtatacact gtggaatgac taaatctatc aatggacatg 19260 
ttcattaact catacttatc atttttttgt ggtaaggaca tttaaaatct accctcttag 19320 
caattttcaa gtatacaaat tgttagtaac tccaatcaca tattgtacaa tgcatctcct 19380 
aaacttatgc ctcctgtctg actgaaattt tgtatccttt gactaacatc cctgtaatcc 19440 
cccattctcc cacagcccct ggtaaccact gttctactct ctgcttcttt gagtttaatg 19500 
ttttagattt ccacatgtga gatcatgtgg aatttgtctt tctgtgcctg gcttatttca 19560 

cttagcataa tgtcatccaa attcatctct gttgtcataa atgacaagat atttgtcttt 19620 

tctatggcta attgttagtc cattgtttat atatatacca tgttttcttt atccatttat 19680 

ccagtgatgg acacttaagt tgatttctat atctgggcta ttgtgaataa tgctgcaatg 19740 

aacatgggaa tgtagatgtc tcttcaatgc actgatttca tttcgtttgg ttgtatatcc 19800 

agaagtggaa ttgctgcatc atatggtagt tctattttta attttttgag gaaactccgt 19860 

acaattttcc atatggctgt actaatttac attccaacca aaagtgtata agggttctgt 19920 

tttctccaca tcctcaccaa catttgtctt tttggtaata accattctaa tgagcatgag 19980 

gtgatgtctc attatggttt taatttacgt ttccctgatg attagtgatg ttgagcattg 20040 

ttttaaatac ctgctggcca ttcatgtctt ctttgtagga atgttatttt aggtttttct 20100 
caUUtaaa tctagttatt tgttttcttg cttttgaatt gtgtgagttc ctcatatatt 20160 
ttgaatatta accccttatc agatgtatca tttgcagaca tgttctccca tcctttaagt 20220 
tgtctcttca ctatgttgat tgtttccttt gttgtgcaga agctttttag tttgctgcaa 20280 
aaccatttat ctattttttc ttctgttgac tatacttcca gagttgtatc caaaaaatca 20340 
ttgccaagaa taatatcaag aagcttttct ctatgttttt ttctagtagt tttatagttt 20400 
caggtcatat gtttaaatct ttaatccatt tttagttgat ttttgtatat ggagtgagat 20460 
aaaggtccac ttttattctt ctactagtgc atatccagtt ttctcaacac catttattga 20520 
agatactgcc ctttcaccac tgtatgttac tggaaccttt gtagatcagt tgacaataaa 20580 
tgtgtgggtg tatttctgga ctctttatcc tgttttatta gtttatatgt ctcttttttt 20640 
agaagctcta tgctgttttg gtgactagag ctctgtagtc aatttcagat caggtagtat 20700 
gatgcactcc agctttgctc tttttgctca aaattgcttt ggctatttga gtttttttat 20760 
tccatacgaa ttttagggct tttttttttt ttcgattact gtgaataatg ccattggaat 20820 
tttgatggag attgcattga atctttgggt agtatggata ttttaacagt attaatgctt 20880 
ccaattaatg aacacagggt attttgcaat ttgtgttttc ttcaatttct ttcaccagtg 20940 
tttttttctt aatttaattg ttttatttcc atagggtttg ggtaacaggt ggtgtttggt 21000 
tatgagtaag ttctttagtg gtgatttgtg agattttgat gcacccatca cctaagcagt 21060 
atacactgta cccaatttgt agtcttgtat ccctcacctc cctcccacca tttcccccaa 21120 
gtccccaaag tccattgtat cattcttatg cctttgcatc ctcatagctt agctcccact 21180 
tatgagtgag aacatataat gtttggttct ccatttctga gttacttcat ttagaatatt 21240 
ggtctccaat tccatccaga ttgctgcgaa tgcctttatt ttgttccttt tcatggctga 21300 

gtagtattcc atagtatata catcccacaa tttctttatc cattcttgat tgatgggcat 21360 

ttggactggt tccatgtctt tacaattgcg aattgtgctg ctacaaacat gcaggtgcaa 21420 

gtgtcttttt catataatga cttctcttcc tctgggtaga taccctgtag tgggattgct 21480 

ggatcaaatg gtagttctac ttttagttct ttaaggaatc tccacactgt tttccatagt 21540 

ggttgtacta gtttacattc ccaccaacag tgtagaagtg ttccctgttc actgtatcca 21600 

caccatcatc tattattatt tgattttttg attatggcca ttcttgcagg agtaaggtgg 21660 

tattgcactg tggttttgat ttgcatttcc ctgatcatta gtgatgttga gcattttttc 21720 

atatatttgt tggccatttg tacatcttct tttgagaatt gtctattcat gtcctttgtc 21780 

cattttttga tgggattatt tgtttttttc ttgctaattt gagttccctg tagattctgg 21840 
atattagacc tttgttggat gtgtaggttg tgaagatttt ctcccactct ttgggttgtc 21900 
tgtttactct gctgattatt tcttttgctg tgcagaaact ttttagttta attaagtccc 21960 
acctatttat cttttcgttg ttgttgtttt ttggggttgt tttgttttgg cttggttttg 22020 
catctgcttt tgggttcttg gtcatgaagt ctttgcctaa gccaatatct agaagggttt 22080 
ttctgatgtt ctagaatttt tatggttcag gtcttagatt taagtccttg atccatcttg 22140 
agttgatttt tgtataaggt gagagatgag gatccagttt catgcttcta catgtggctt 22200 
gccaattatc ccagtacaat ttgttgaata gggttaatat ttaaagcttt atatatttag 22260 
gtgttcctat tttgggtaca tatttattta caactatcat atcctcctga tggattgacc 22320 
cctttctcat tatataatgg tcttcttgtc tctttttaca gtttttgtct taaagcctaa 22380 
tttgtctgat aaaagttcag ctacctttgc tctcttttgg tttctatttg catggaatat 22440 
ttttttccaa cccttcgcat tcactctatg tgtgttctta aagatgaaat gagatgctgt 22500 
aggggcatat gcttgggtct tgttttattc attcattcag ccaccctttt gattagagaa 22560 
tttaattcat ttgtattcaa ggtaattatt gacagacaag gacttactac tgccattttg 22620 
ttaattgttt tcttgatgtt ttatagatct tttgttcctt tcatcctctc ttactctttt 22680 
cctttgtgat taggtgcttt tctctagtgg tgtactttga tttttacttt ttatcttttg 22740 
ttgctctact ataggttttt gctttgtggt taccatgagg gttacataaa gcatagttat 22800 
aaaaggctat tttaaactga taacagctta actttcaaca cttaaaaaaa ctatacactt 22860 
ttactctacc aactgccctc cattttatgt ctttgatgtc ataatttacc tagttttgga 22920 
gatgtgtccc cttattgtgt atcccttaac aaattattgt agcaacagtc atttttaata 22980 
gttttggctt ttaactttat actagagata gaattaatta acataccacc actacattat 23040 

tagggtattc taaattgact atgtatttac ctttatcagt gagatttttg ttttcaattt 23100 

tcatgttgtt aattagtatt ctttcatttc aacttggaga attcacatta gcattttttg 23160 

taagatgggt ctagtagtgg tgaacaccct caacttttgt ttatctggag atgtctttac 23220 

ctctgcttca ttttgaaata taacttttgt tccatgattg aaatggacaa aattgttttt 23280 

ttaattatgc aaagtgccag ggtaagcaga attactcttt tttttttttt ctgagaccga 23340 

gtttcactct tgttgcccag gctggagtgc agtggcgcaa tctctcagct taccgcaacc 23400 

tctgcctccc aggttcaagc gattcttctg cctcagcctt cctgagtagc tgggattaca 23460 

ggcatgcacc accatgctcg gctaattttg catttttagt agagacgggg tttctccatg 23520 

ttggtcaggc tggtcttgaa cacccgacct cagatgatcc gcccacctag gcctcccaaa 23580 
gtgctgggat tgcaggtgtg agccactgcg cctggccaga attactctta tttatcctga 23640 
gcttgaggaa gaaagaattc aaaattaaaa tttcacatta cctaatggcc aaagcctgca 23700 
ttcaaaataa gtaatcagaa aaacatataa aaacacaata agataaacag actaaatata 23760 
tgcagtcatt ttatggaacc aatctgacta gattggatgc agactaggta ggatgcaaat 23820 
ttaaaaaaaa ctttattctt cttccactta taaactttaa acctgctttg tggagcaagt 23880 
tctttttatc tctggggaaa gatcctgagt aagtctcata gagttctcat tcatttaaat 23940 
cacaagaaca atcttaggtc agtaattaaa ctatctggcc cagtgtaata ctgaaacttt 24000 
caaatactta tccacttgag ctcttctttc catcccagct tggtacttct ttggtcctag 24060 
aagccagcag tggtttatca tcgacttatt cttactgact agctccccaa tacccagtag 24120 
ctgctgtttc tggcccctcc aggaatggtt ttaggaggaa aggggataag gagtaaaggg 24180 
ctggtactat tgtgatcatg ccaaagggct tggtggatat tccatgcttc cctttctctc 24240 
aagaggaaac tccctttctt ggagactctc tcactagaac tttccagagg tgattcaggg 24300 
gacaagagaa taattgtcct taggcagact ctttttcaag ctggtcccag agctttccct 24360 
cttgccagtt aattggttta aggacacagt tgcacatcct tgccttgcct ctgctgctgt 24420 
cctctgcctt tctgtctgtt ctgagttata gcctttcaca tcagtcctgt actccccaaa 24480 
ctccaaggag cacaagtcag atcatctaag tgatcctctt gaagcctctt gtttaagatg 24540 
ggggaagcac ccttcctttt ccatggcact ctggcattcc aacaacactt taaataattt 24600 
tttctctcaa aattcttaag cctctcctct ttaatccttc gccattttta tgtattatta 24660 
ctttatatga tgagctaaga gttacaaaac tggtttttag aaatctcctt agcaaatgtt 24720 
ttactgctag tttagcagct cactttataa taaggatata tgatatattt ctttggttcc 24780 
tctgcctctg ggacctcagc tcatcctgag gcagagagtc ccattttaac attctgttac 24840 
ataaaccagt ggcaaaatgg ctttaacctg agggtaataa ttaccaggaa caaacagaaa 24900 
acagaaaaaa agtaaactgg ttatgatatc tgagtccctt ccctccctca tcctcacagg 24960 
gaccagctgg gtgagatgtc gtacaccaca atgtgcatca aggagacgtg ccgattgatt 25020 
cctgcagtcc cgtccatttc cagagatctc agcaagccac ttaccttccc agatggatgc 25080 
acattgcctg caggtcttta cattcttttc ctaagcagtt cttagaggct atgggatcct 25140 
ggagaccaca gtgacaaaga ttagtgagtc tcttagcact tggagaagtc aaaagataat 25200 
gctaacatgt gacttaggtt ttatcaccta tgaggagctc agaggataat gctttggtca 25260 
gacatgaatt tcaatgactt tcccaaaggc acatagccag ttgcagcaaa gctaagccca 25320 
gaatccatgt ctctggaatc ccagcccagg gtctcttcca ttgtgggaca tcatttctaa 25380 
gataatcttt gtttggctga gtttgagacc gagctgaaac ttcatggaaa atagcaccag 25440 
catctttatc tgaaagacca agggggatct ttggcctcat catcataata tcacccttat 25500 
aaatatacaa catttaatag ttaatataga gccttcagac ccattatctc atttttcccc 25560 
ttggaatcca atgttaacag atgcttatac aatgatttac agttcactga acacttttaa 25620 
gtactttcaa tgtggcccaa aatccagagg cagccccaat gtgtagatga cattaactga 25680 
tgtgagcaga gctagaactt gtgcggagac cctgagtctg gagcctagag ttcttcggaa 25740 
caacacaggt ttctgagcag ggcttatagg aagcagaggg gtcatgtgag acatattatc 25800 
tgattcaatg ttctattaat tcatgtctta ggaagcaagc caacaggatt gcttctggca 25860 
aacacctaca gcctgttact gtaactttgc tgacagaccc agaattaatt tctggaagct 25920 
agaattattt ctggaaacca aataaccctc acattctctc tcctttgttt tgtactctgt 25980 
ttctccccaa accacatgga tatttgccaa aattctccac tttccatatg tgaatagcac 26040 
caatggaaat ttgtcatggg atctgcatga cagaatcaca gttctgtgtg tgtgtgtgtg 26100 
cgttttcctc tcaagacaga gtcttgctat gtagcccagg ctggagtaca gtggcgtaat 26160 
ctcggctcac tgcaacctct gcctcccagg tttaagcagt tctcctgcct cagcctcccg 26220 
agtagctggg attacaggtg cacaccacgc ctggcaaatt tttgtatttt tattagagat 26280 
ggggtttcac catgttggcc aggctagtct caagctcctg atctcgagac cagccctcct 26340 
cagcctccca aagcgctggg actacagcca tgagccactg cacccagcca gttctgtgct 26400 
tttataccta aattgtctcc aggagtgctt aatagtccat taataggtat ttaggccagg 26460 
cacagtggct gacgcatata atcccaatat tttgtgacac caaggtggga agactgcttg 26520 

aagttaggag tctgagacta gcctgggcaa catagggaga ccctgtcttt acaaaaaaaa 26580 

aaaagagaga gatagccagg catggtgttg catgcttgta ttcctgccta cttgggggac 26640 

tgaggcagga ggatcacttg agctcagaag ttcaaggtta ccgtgagcaa tgttcacgcc 26700 

actgctctcc agcctgattg acaggccaga ccctgactct aaacaaaaac aaaaaacaaa 26760 

tatttaagta atttccaaac atagcagaaa atataagcat ggtttatcac tttgatatga 26820 

caccaacagc tacttaagat agagtcatga attcagtaaa ttgttgtgtg gaaagctaag 26880 

gtgccaaccc aagccgcatc ttcttaggtg ctcctcactg gtgtcatcag ctacagcagg 26940 

cagagcattg ccaggagcta gctcttccct tcaagaacaa aagtcttgtt taagagcaca 27000 

gtagcccaca acttgctctt tctcctgcag tctcttttat ttccctcctt tcttagggat 27060 
caccgtggtt cttagtattt ggggtcttca ccacaaccct gctgtctgga aaaacccaaa 27120 
ggtatgattc tctcttgtac ataaatactt ccaagaacta atgctgtgca agtcactttt 27180 
tggtagctaa gcacagaagt ggctatataa ttaagggaaa tgacacaaat taaacaaaaa 27240 
taaacataaa agccaaaaga aatgtaaaac tattctatgt tcttgaaaca ctcttgacgt 27300 
gtatcagtga tttctttcat gtaagccact aaggtttaag atctattact tgtaacagga 27360 
agctggagta tatgtctctg taataattgg ccacatcatc attttgactt gatttctaag 27420 
tggatgcaca tccatttcta agtggatgta tctccatagt gaaaataata ccacttgcca 27480 
tagtattttt gtttgcctgg gtatcagaca aatcagctgt gaagctgcaa ggtctgcagg 27540 
tctgaaggta cactgcccag tgtagtagcc acgggccaca tacggctact gagcacatga 27600 
catgtggcca gttggaattg agttgtgctg taagtttaaa atacgtgctg gattttgaag 27660 
acatagtacc ctaaaaaaat gtgaaacatt tccttttagt aattatttat attgattaca 27720 
ggttggaatg gtaatttttg gttaaataaa ctctattaag attaacttca ccttttaaaa 27780 
atgtgaccac cagaacattt taaattacac atgtagatca cattatattt ctattgatcg 27840 
gtgctaggtg gtaggtgaag aaatgtgttc atgttgtttg ggggatggtg ttggggttgt 27900 
cctctcattt caggtctttg accccttgag gttctctcag gagaattctg atcagagaca 27960 
cccctatgcc tacttaccat tctcagctgg atcaaggtga gaacaatttg aagttgctga 28020 
aagtacccaa agatgtttac ttgagagtag tttattcctt tcagctcctc agctctatac 28080 
attcttccag ggaaccgtag atcttggtgc ctatttgagc cccaaaggat cagttagttt 28140 
tacaaaggac aatcgtattc tctgtcacat cctttttggc catgcctcaa aagcagtccc 28200 
acaatgtaag ctactgctca taggctcaat gcagtccacc ttcaaagcaa gagaaataat 28260 
ttcatgagta actccaactg ccgccttgtt atagggaagg catcatgttg gagcctccca 28320 
gctcaaattc tcacagtgaa caatttaagt ctaaagttca aaagtttcaa tggcatttgg 28380 
tggaaaaaat atcactttac tgtgtacttc agacttcttg tactagtatt ttactatagt 28440 
cagaagaaac atcatttttt caagtatcac tttctttccc tcttgtcttc aggaactgca 28500 
ttgggcagga gtttgccatg attgagttaa aggtaaccat tgccttgatt ctgctccact 28560 
tcagagtgac tccagacccc accaggcctc ttactttccc caaccatttt atcctcaagc 28620 
ccaagaatgg gatgtatttg cacctgaaga aactctctga atgttagatc tcagggtaca 28680 
atgattaaac gtactttgtt tttcgaagtt aaatttacag ctaatgatcc aagcagatag 28740 
aaagggatca atgtatggtg ggaggattgg aggttggtgg gataggggtc tctgtgaaga 28800 
gatccaaaat catttctagg tacacagtgt gtcagctaga tctgtttcta tataactttg 28860 
ggagattttc agatcttttc tgttaaactt tcactactat taatgctgta tacaccaata 28920 
gactttcata tattttctgt tgtttttaaa atagttttca gaattatgca agtaataagt 28980 
gcatgtatgc tcactgtcaa aaattcccaa cactagaaaa tcatgtagaa taaaaatttt 29040 
aaatctcact tcacttagcc gacattccat gccctgacca atcctactgc ttttcctaaa 29100 
aacagaataa tttggtgtgc attctttcag actttttcct atacatttta tatgtagaaa 29160 
tgtagcaatg tatttgtata gatgtgatca ttcctatatt gttattgatt tttttcactt 29220 
aataaaaatt caccttattc cttatcattg ctttatggta ttctgtaata tgaatgtact 29280 
ataatttatt taactatttt ccttattggg catttaagtt atttctagtt ttaaaaacat 29340 
gcttgtcaat ggcaacaaaa gccaaaattg acaaatggga tctaattaaa ctaaagagct 29400 
tctgcacagc aaaacaaact accatcacac tgaatgggca gcctacagaa tgggagaaaa 29460 
tttttgcaac ctactcatct gacaaaggcc taatatccag aatctacaat gaactcaaac 29520 
aaatgtacaa gaaaaaaaca accccatcaa aaagtgggtg aaggatatga acagacactt 29580 
ctcaaaagaa gacatttacg cagccaaaag acacatgaaa aaatgcctat cgtcactggc 29640 
catcagagaa atgcaaatca aaaccacaat gagataccat ctcacaccag ttagaatggc 29700 
aatcattaaa aagtcaggaa acaacaggtg ctggagagga tgtggagaaa taggaagact 29760 
tttacactgt tggtggcagg agaatcactt gaacccggga gggggaggtt gcagtgagcc 29820 
gaggtggcgc cactgcactc cagcctgggc gacagaacga gtactccatc tcaaaaaaaa 29880 
aaaaaaagga caccaaactt ctcaatctta atgttgtcat ctatgtggta tcttccataa 29940 
tctctctcag acagagtcat cttttgctga tatgatctta cagtattttt tgtttatacc 30000 
attataatct cattaattgc agcaacacaa atgacaaaag acaactgatt tctccccttg 30060 
gatgacctaa tttgctttca ctcttccatc atcacttata acatgatgat tctcaaattc 30120 
atctacctaa aatctatata taaaaaaatc cctcccttga attccagatc cttggagaca 30180 
aacacccacg tctaaaacca aatttgttta acactggacc agtcgtcctg tgtgactttc 30240 
cattttgtca ctattttgtc agctggtata ccaatatcca cccagttaaa caatatttcc 30300 
ttgttttttt ctggtacaaa cccaaataaa ttacaaacat caataaaagt aaaattctaa 30360 
aataactcac tttctctata tatctccttc ttgctggaaa aatgggttag gttagttctt 30420 
taaaagcatg catgataaat tgtactgaat acaatattca ggtctggaca tactaggtat 30480 
aattttctgt gtctctgggg tcttacctat ttggggtcaa aataaacaag tttattaagc 30540 
ttattaatat tcaatttcat tatcttcttt aacaattatg ttccctggta gtttcattgc 30600 
caataattta tttgtcaggt tgccaggtgc ttctaaactt ctgtgtattt tttcatatcc 30660 
aattttactt taaatatttt tagaaaagag gtctgttaaa tttcctaata attattatat 30720 
tattgttttt tcactgacat tttgtgaatt gaaaaccctt aaaaatatga aatcattttt 30780 
tcgaaatatg tgccacagac aattttgtta aataagaaga cagaaacagg gcattatcaa 30840 
gagataaata ttcaatatac cttatatttc tgtcacacat ttttatacca actgtgccaa 30900 
aaattgtata tcatataaat gataacaagt tcacaaaggc attcctttat cccttaactc 30960 
tcaaattaga aactttcata ggtaggaagt aggggaagca tatattccct ttgaaaggtg 31020 
caagaaaatg tcattggcat tcaccatggt actcttcaag cttaaaaaaa atggactgca 31080 
aaacatttac aaacatagca tatttattgg gtacctttat gtttacataa atattgaaga 31140 
tatctcacat acctctttca atcagattat ctcactgaca tttattgacc actttctatg 31200 

31208 

<210> 4 
<211> 489 
<212> PRT 

<213> Homo sapiens 
<400> 4 

Val Ala Ala Leu Leu Gly Leu Leu Leu Leu Leu Leu Lys Ala Ala Gln 

1 5 10 15 

Leu Tyr Leu His Arg Gln Trp Leu Leu Arg Ala Leu Gln Gln Phe Pro 

20 25 30 

Cys Pro Pro Phe His Trp Leu Leu Gly His Ser Arg Glu Phe Gln Asn 

35 40 45 

Asp Gln Glu Leu Glu Arg Ile Gln Lys Trp Val Glu Lys Phe Pro Gly 

50 55 60 

Ala Cys Pro Trp Trp Leu Ser Gly Asn Lys Ala Arg Leu Leu Val Tyr 
65 70 75 80 
Asp Pro Asp Tyr Leu Lys Val Ile Leu Gly Arg Ser Asp Pro Lys Ala 

85 90 95 

Pro Arg Aso Tyr Lys Leu Met Thr Pro Trp Ile Gly Tyr Gly Leu Leu 

100 105 110 

Leu Leu Asp Gly Gln Thr Trp Phe Gln His Arg Arg Met Leu Thr Pro 

115 120 125 

Ala Phe His Tyr Asp Ile Leu Lys Pro Tyr Val Gly Leu Met Val Asp 

130 135 140 

Ser Val Glo Ile Met Leu Asp Arg Trp Glu Glo Leu Ile Ser Gln Asp 
145 150 155 160 

Ser Ser Leu Glu Ile Phe Gln His Val Ser Leu Met Thr Leu Asp Thr 

165 170 175 

Ile Met Lys Cys Ala Phe Ser Tyr Gln Gly Ser Val Gln Leu Asp Arg 

180 185 190 

Asn Ser His Ser Tyr Ile Gln Ala Ile Asn Asp Leu Asn Asn Leu Val 

195 200 205 

Phe Tyr Arg Ala Arg Asn Val Phe His Gln Ser Asp Phe Leu Tyr Arg 

210 215 220 

Leu Ser Pro Glu Gly Arg Leu Phe His Arg Ala Cys Gln Leu Ala His 
225 230 235 240 

Glu His Thr Asp Arg Val Ile Gln Gln Arg Lys Ala Gln Leu Gln Gln 

245 250 255 

Glu Gly Glu Leu Glu Lys Val Arg Arg Lys Arg Arg Leu Asp Phe Leu 

260 265 270 

Asp Val Leu Leu Phe Ala Lys Met Glu Asn Gly Ser Ser Leu Ser Asp 

275 280 285 

Gln Asp Leu Arg Ala Glu Val Asp Thr Phe Met Phe Glu Gly His Asp 

290 295 300 

Thr Thr Ala Ser Gly Val Ser Trp Ile Phe Tyr Ala Leu Ala Thr His 



305 310 315 320 

Pro Glu His Glo His Arg Cys Arg Glu Glu Ile Gln Gly Leu Leu Gly 

325 330 335 

Asp Gly Ala Ser Ile Thr Trp Glu His Leu Asp Gln Met Pro Tyr Thr 

340 345 350 

Thr Met Cys Ile Lys Glu Ala Leu Arg Leu Tyr Pro Pro Val Pro Ser 

355 360 365 

Val Thr Arg Gln Leu Ser Lys Pro Val Thr Phe Pro Asp Gly Arg Ser 

370 375 380 

Leu Pro Lys Gly Val Ile Leu Phe Leu Ser Ile Tyr Gly Leu His Tyr 
385 390 395 400 

Asn Pro Lys Val Trp GU Asn Pro Glu Val Phe Asp Pro Phe Arg Phe 

405 410 415 

Ala Pro Asp Ser Ala Tyr His Ser His Ala Phe Leu Pro Phe Ser Gly 

420 425 430 

Gly Ala Arg Asn Cys Ile Gly Lys Gln Phe Ala Met Arg Glu Leu Lys 

435 440 445 

Val Ala Val Ala Leu Thr Leu Leu Arg Phe Glu Leu Leu Pro Asp Pro 

450 455 460 

Thr Arg Val Pro Ile Pro Ile Ala Arg Val Val Leu Lys Ser Lys Asn 
465 470 475 480 

Gly Ile His Leu Arg Leu Arg Lys Leu 

485 
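
The listing above gives residues in three-letter codes, whereas the BLAST alignment in the figure uses one-letter codes. As a minimal sketch for comparing the two, the conversion could look like the following. The names `THREE_TO_ONE` and `to_one_letter` are illustrative and not part of the source, and tokens that are not one of the 20 standard codes (for example OCR-damaged residues) are mapped to `X` rather than guessed.

```python
# Standard three-letter to one-letter amino acid codes (20 common residues).
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(residues: str) -> str:
    """Convert a whitespace-separated run of three-letter codes to one-letter codes.

    Unrecognized tokens become 'X' so that damaged residues stay visible.
    """
    return "".join(THREE_TO_ONE.get(token, "X") for token in residues.split())


# First row of the listing (residues 1-16), assumed here to read "Gln" at position 16.
first_row = "Val Ala Ala Leu Leu Gly Leu Leu Leu Leu Leu Leu Lys Ala Ala Gln"
print(to_one_letter(first_row))
```

Mapping unknown tokens to `X` is a deliberate choice: it keeps the output the same length as the input row, so position numbers in the listing still line up.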



[FIG. 1] 

i cccgcrecc icciCTcccc abbcowcc naxccTCtt n creccrra 

si cntnccCB ccactcacaa gcttccxac cccco&ala o ccm tbbcc 

un tkecwox Ttcuxjccr ccgcccgcea cjuaecccat c c rCTtco uc 

151 gccccacgaa a cogcccsc s mcccg re cgcagagcca raw not 

zm cnmm aoociua aau uit nacc raBS ncciwrn 

2S1 HXTGGCTXT CB6BC7BCTC CAGECCA7TA AGCTETACCT CCCCACGCAC 

301 OGGCTGCIGC GGGACCTGCC CCCCTTCCCA GCGCCCCGCA CCCACFGC7T 

SSI CCTTGGGCAC CA&AASTTTA TTCAGSATCA TAACATGSAC AAGCTTGAGS 

401 AAATTATTCA AAAATACCCT C OT CCT IC C CTTTCTGUI r M BCC CT TT 

4S1 CACGCATTTT TCTCTATCTA TSACCCACAC TATSCAAAGA CACTTCTCAC 

501 CACAACACAT CCCAACTCCC GCTACCTCCA CAAATTCTCA CCTCCACTTC 

SSI 7TCSAAAAGC ACTAGCGCCT CTAGACGGAC CCAACTGCTT CCAGCATCCT 

601 CGCCTACTAA CTCCTGGA77 CCATTTTAAC ATTXTCAAAC CATACATTCA 

651 CGTUTGGCT CATTCTCTGA AAATtATCCT CCATAACTGC CACAAUTTT 

701 CCAGCACTU GCACACAAGC CTttACCTtT ATCAGCACAT CAACTOUTC 

751 TCTCTGCATA TAATCATGAA ATGCCCTTTC AGCAAGCAGA CCAACTCCCA 

BD1 CACAAACAGC ACCCATCA7C CTTATGCAAA AGCCATATTT CAACTCACCA 

SSI AAATCATATT TCACC6CTT6 TACACTTTCT T6TATCACAC TCACATAATT 

901 ITCAAACTtA GCCCTCACCC C7AC0CCTTC CAGAACTTAA CCCCACTCTT 

951 CAATtACTAC acautacaa taatccagga aacaaagaaa r a zn xA GG 

1001 CTGGGCTAAA GCAGGATAAC ACTCCEAAfiA GCAAC7ACCA GEATTTTCTC 

10S1 GATATTETTX 7TTCTGCCAA CCATGAAAG7 CCTACCAGCT TCTCAGATAT 

1101 TCATETACAC TC7GAACT6A CCACATTCCT CTTGCCAGCA CATGACACCT 

1151 rCCCAGCAAC CATCTCCTC6 ATCCTTTACT B C CTSCCTtT GAACCCTCAC 

1201 CATCAAGA&A GATGCCGGGA GSAGCTCAGG GGCATCCTCS GGGATGSGTC 

1253 ITCTATCACT TGSGACCAGC TCG67CAGAT 6T0GTACACC ACAATETGCA 

1101 rCAACSAGAC 6TCC0EA7TC ATTCCTCCAG TCC06TCCAT TTCCAGAfiAT 

13S1 CTCAGCAACC CACTTACCTT CCCAGATGSA TCCACATTCC CTGCAGG&A7 

1401 CA CCCTBCT T CTTAGTATTT GGGGTCTTCA CCACAACCCT 6CTCC7S7CT 



1453 GCAAAAACCC AAAS5TCTTT CAJXCCTTCA GBTCTCTCA QUSAATTCT 

1503 CATCASASAC A0CCCTAT6C CTACTTACCA TTCTCACCT6 CATCAAEGAA 

1553 CTGCATTG6S CACSACTTTS CCATCATTCA GTTAAAGCTA ACCATTSCCT 

1601 TGATTCrCCT OAOTtAfiA KTGACTCCAC ACCCCACCAG GCCTCTTACT 

1651 TTCCCCAACC AITTTAICCT CAAGCCCAAfi AATCQCATST ATTTCtACCT 

1701 CAAUAACTC TCTCAAICTT ACA7CTCA66 CTACAA7CA7 TAAA0G7ACT 

1751 Il U li ntt AACTTAAATT IACAGCTAAT CATtCAAGCA GATAGAAACG 

1801 CATCAATGT4 TCCTGGGASG ATTGGAGC7T GCTGGCATAC BGCTCTCTtl 

1SS1 GAA&AGAtCC AAAATCA7TT CTAGCTACAC ASTCTGTCAG CTAfiATtTCT 

1901 TTtTATAfAA CTTTEGCACA TTTTCAfiATX TTTTCTCTTA AACTTTCACT 

1953 AC7ATTAATC CTCTATACAC CAATACACTT TtATATATTT rCTETTSTTl 

2001 TTAAAATACT TTTCAfiAAfT ATSCAACTAA fAAGTGCATC TATGCTCACT 

2053 CTCAAAAATT CCCAAtttf A GAAAATXATC tAGAATAAAA ATTTTAAATt 

noi TUcntACT tagcdsacat rtXATGCCn gaccaatcct AC7GC7TTTC 

2351 CTAAAAACAE AA7AA7TTGG rCTGCATTa TTCAfiATTTT TTCCTATACA 
2203 TTTTATAICT ACAAA7GTAG CAATC7ATT7 ETA7AGATCT GATCA7TCC7 
2253 ATATTCTTAT 7CA7TTTTT1 CACTTAATAA AAATTCACCT TA7TCCTTAA 
2303 AAAAAAAAAA AAAAAAAAAA AAAAAAA 
(SEQ ID NO:1) 

FEATURES: 

5'UTR: 1-189 

Start Codon: 190 

Stop Codon: 1720 

3'UTR: 1723-2327 
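
The feature coordinates above can be cross-checked against the protein length with simple arithmetic. The sketch below assumes the start and stop positions are 1-based and each marks the first base of its codon; this fits the 3'UTR beginning three bases after the stop position, but it is an assumption rather than something the figure states. The function name `coding_length` is illustrative.

```python
# Coordinates as printed in the FEATURES block (assumed 1-based, first base of each codon).
START_CODON = 190
STOP_CODON = 1720


def coding_length(start: int, stop: int) -> int:
    """Return the number of codons from the start codon up to, but excluding, the stop codon."""
    span = stop - start
    if span <= 0:
        raise ValueError("stop codon must lie downstream of the start codon")
    if span % 3 != 0:
        raise ValueError("start and stop codons are not in the same reading frame")
    return span // 3


print(coding_length(START_CODON, STOP_CODON))
```

Under these assumptions the open reading frame spans 1530 bases, that is 510 codons.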



Homologous proteins: 
Top 10 BLAST Hits 
                                                                      Score   E 
gi|2117369|pir||A29368 prostaglandin omega-hydroxylase (EC 1.14..     521   e-146 
gi|117166|sp|P10611|CP44_RABIT CYTOCHROME P450 4A4 (CYPIVA4) (P..     520   e-146 
gi|164981|gb|AAA31232.1| (J02818) cytochrome P-450p-2 [Oryctola..     520   e-146 
gi|1656|emb|CAA40493.1| (X57209) omega-hydroxylase cytochrome..       518   e-146 
gi|89989|pir||A34260 laurate omega-hydroxylase (EC 1.14.15.3) c..     517   e-145 
gi|117167|sp|P14579|CP45_RABIT CYTOCHROME P450 4A5 PRECURSOR (C..     516   e-145 
gi|203787|gb|AAA41038.1| (M57718) cytochrome P-450 IVA1 [Rattus..     510   e-143 
gi|89992|pir||B34160 cytochrome P450 4A7 - rabbit >gi|164985|gb..     510   e-143 
gi|3738263|dbj|BAA33804.1| (AB018421) cytochrome P-450 [Mus mus..     509   e-143 
gi|8393238|ref|NP_058695.1| cytochrome P450, subfamily IVB, pol..     508   e-143 

BLAST to dbEST: 
                                                                      Score   E 
gb|AI812435|AI812435 Oll-ST0181-261099-026-a02 ST0181 Homo sapi..     1092  0.0 
gb|R56515|R56515 yg94d06.r1 Soares infant brain 1NIB Homo sapie..     769   0.0 
gb|AA337301|AA337301 EST42040 Endometrial tumor Homo sapiens cD..     640   0.0 
gb|AA652746|AA652746 ns65c09.al NCI_CGAP_Pr22 Homo sapiens cDNA..     636   e-180 
gb|AA863360|AA863360 oh04f03.el NCI_CGAP_Kid3 Homo sapiens cDNA..     599   e-168 
gb|AA319338|AA319338 EST21550 Adrenal gland tumor Homo sapiens ..     555   e-155 
gb|BP355963|BP355963 CM1-HT0878-060900-398-b08 HT0878 Homo sapi..     381   e-103 
gb|BF445825|BF445825 nae41d04.x1 Lupski_sympathetic_trunk Homo ..     365   5e-98 
gb|AA557324|AA557324 nl81a02.s1 NCI_CGAP_Br2 Homo sapiens cDNA ..     357   1e-95 
gb|AV683266|AV683266 AV683266 CKC Homo sapiens cDNA clone CKCDQ..     323   2e-85 
gb|AI264444|AI264444 xr03d03.x1 NCI_CGAP_JBmS3 Homo sapiens cDN..     242   5e-61 


EXPRESSION INFORMATION FOR MODULATORY USE: 
library source: 

Expression information from BLAST dbEST hits: 
gb|AI812435| Stomach 
gb|R56515| Soares infant brain 1NIB 
gb|AA337301| Endometrial tumor 
gb|AA652746| normal prostate 
gb|AA863360| kidney 
gb|AA319338| Adrenal gland tumor 
gb|BP355963| head neck 
gb|BF445825| Lupski_sympathetic_trunk 
gb|AA557324| breast 
gb|AV683266| hepatocellular carcinoma 
gb|AI264444| brain 



Bxpreitjos.Ufor 
Ibete brals 



Hoi free rO-b iic4 tlii ae i creeilt psseli: 
[FIG. 2] 

l EPsmrn adttuct oaioiqai arusgm numwr 
a rmvsxr iQsnrme EitEcrnur micwojiy rTirvsTAi 
icq Tusznns moron imjulcc rtirgsmj. ttcthtsiu 

151 ATtETIABST DKJX1ZX1 CSnjDTSTCT TEB1SSGLD I VBUISB 

201 re cq ns TEP ruuimi nnmm ithsbufp. srgcrpm 

2S1 amJBJTTTJT IIQESXXSLQ ACTKafTTt LSACESCSS 

301 rsDionsET smuiEDT uASiniu cuasreae tmmsu 
3si csGssrnco laxsrmc imauM tkioeui pLTTncni 
40i FAGimui mrowT mmrm etsoessdoi hftaiutsa 

451 CSKSCIGQE7 AMZEUTTU ULWITTT VfTSTlTfTH KPILOTKOI 
S01 TLHLEKLSEC 

FEATURES:

Functional domains and key regions:

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site

206-209 H3TB

[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE
cAMP- and cGMP-dependent protein kinase phosphorylation site
Number of matches: 2

1 265-268 BUS
2 505-508 Oli

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site
Number of matches: 4

1 159-161 STI
2 273-275 TW
3 292-294 SAX
4 374-376 TO

[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE
Casein kinase II phosphorylation site
Number of matches: 9

1 4-7 sua
2 104-107 sm>
3 172-175 STQB
4 176-179 TSTB
5 207-210 STBS
6 282-285 SAD
7 300-303 SFSO
8 302-305 SSI0
9 393-398 TFTO

[5] PDOC00008 PS00008 MYRISTYL
N-myristoylation site
Number of matches: 5

1 25-30 OXQAI
2 298-303 GSSns
3 353-358 CS3IT9
4 451-456 GSRNCI
5 457-462 GQEFAM

[6] PDOC00081 PS00086 CYTOCHROME_P450
Cytochrome P450 cysteine heme-iron ligand signature
448-457 FSAGSRNCIG
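The short site motifs listed above can be located in a protein sequence by converting the corresponding PROSITE patterns to regular expressions. The sketch below assumes the published patterns N-{P}-[ST]-{P} (PS00001), [ST]-x-[RK] (PS00005) and [ST]-x(2)-[DE] (PS00006). The helper `scan` and the ten-residue test peptide are illustrative only and are not taken from the sequence in this figure.

```python
import re

# PROSITE patterns rewritten as regular expressions
# (x = any residue, {P} = any residue except P, [ST] = S or T).
MOTIFS = {
    "ASN_GLYCOSYLATION": r"N[^P][ST][^P]",  # PS00001: N-{P}-[ST]-{P}
    "PKC_PHOSPHO_SITE": r"[ST].[RK]",       # PS00005: [ST]-x-[RK]
    "CK2_PHOSPHO_SITE": r"[ST]..[DE]",      # PS00006: [ST]-x(2)-[DE]
}


def scan(sequence, pattern):
    """Return (start, end, match) tuples with 1-based inclusive positions.

    The pattern is wrapped in a lookahead so that overlapping sites
    are all reported.
    """
    hits = []
    for m in re.finditer(f"(?=({pattern}))", sequence):
        start = m.start() + 1
        hits.append((start, start + len(m.group(1)) - 1, m.group(1)))
    return hits


# Hypothetical test peptide, not part of the disclosed sequence.
peptide = "MNSTEDSARK"
results = {name: scan(peptide, regex) for name, regex in MOTIFS.items()}
```

The lookahead matters because sites can overlap, as with the N-glycosylation site at 206-209 and the casein kinase II site at 207-210 in the listing.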



Membrane spanning structure and domains:
Helix Begin End Score Certainty

1 12 32 1.635 Certain

2 76 98 1.029 Certain

3 316 336 1.077 Certain

4 395 415 1.443 Certain



BLAST Alignment to Top Hit:

>gi|2117369|pir||A29368 prostaglandin omega-hydroxylase (EC 1.14.15.-)
cytochrome P450 4A4 - rabbit
Length = 510



Hmmer search results (Pfam):

Model     Description                     Score   E-value
PF00067   Cytochrome P450                 416.5   2.5e-121
CE00363   E00363 glycine_receptor_beta    2.1     4.7


Score = 521 bits (1328), Expect = e-146

Identities = 246/493 (49%), Positives = 3SS/493 (71%), Gaps = 1/493 (0%)
Frame = +1

Query: 235 LAFYFCIALGLL£AIKLY\JIRQRUJ 411 

♦A + L L LL+A +LYL RQ LLR L+ FP PP Iff LCH ■» Q-*D *E 
Sbjct: 21 VAALLGLLLLUJCAAQLYUlRCmJWLQQFrXPm 80 

Query: 412 K YP RAFPFT I GPFQAFFC I VTJPDYAKTLLSRTOPKSRYl^KFSPPLLGKCLWLJ^GPJW 591 

IfP A P+W+ +A +YDPDY K +L R+0PK+ K P *C CL LOG »F 
Sbjct: 81 ^TCACPWt^SG1fl^LVTDFWLKVIU3^^ 140 

Query: 592 QHRFLLTPGFHFN 1 LKAY1 E VUAHSVKMMLDXTCK I CST QDTSVEVYEH I JfSUSLD 1 1 MX 771 

QHRR*LTP FH++1LK Yt SV+HflJHffE-tf S QD+S+E+ «-»L0 ItOC 
Sbjct: 141 QHRRMLTPAFHYD I LKPYYG UIYDSVQ I IflXHflFEQL I S-QDSSLE I FQHVSLMTLDT I ktt 199 

Query: 772 CAFSKETT^rySTh^PYALAIFEI^I IFHRLVS1XYHS01 IFKLSPQGYRPQKLSRVL 951 

CAPS + * Q + Y +AI +L+ •h+F+R ++ ♦ SD +++LSP-KJ P t -m 
Sbjct: 200 CAFSYQGS VQLORNSHS Y I OA I hfiXJWl VPYRARNVFTOSDFl^ 259 

Query: 952 NQYTDTI 10£RIUCSWAGYKQDNTPKRKYQDFU)IVlSVuXDr^ 1131 

♦++TD +IQ+RK LQ ♦ ♦ DFUH+L AX E+GSS SD [H +EY TF+ 

Sbjct: 260 HEHTDRVIQQSKAQUK^ELEKVRRKRRlJi^VUJ 7 319 

Query: 1132 LAGHDTLAASIS^ILYCUlJ.T^QERCREcVRGILGC« 1311 

GKTJT A+ +SV1 Y LA +PEHQ RCREE++GHXJDG+SITW+ L +M YTTMCIKB 
Sbjct: 320 PEGHDmSGVSTlFYAUTHPEHQHRCREEIQGLlXDGASITOK^ 379 

Query: 1312 CfaiPAVPSISrTW^B'LTFPDGCTU'AGimLSITCUlr^ 1491 

RL P VPS++R LSKP+TFPDG HP G+ ♦ LSTtGLH^NP V?+t1pVVFuP RF* 
Sbjct: 380 LRLYPPVPSVTTOI^KPVITTTCRSLPKGYI LFLS lYGLHYW-KVPAWPEYFuPFRFAP 438 

Query: 1492 ElCDQRhTYAYlJFSAGSRrWIMEPAMIELJCVTIALIUJO^ 1671 

*+* H ♦A+LPFS G*RNCIG* *FAM ELKV +-AL LL F ♦ POPTR *L 
Sbjct: 439 DSA — YrCHAFUTSGGARNCIGIQFAJiRELKVAVALT^^ 496 



Parsed for domains:

Model     Domain  seq-f  seq-t  hmm-f  hmm-t    score   E-value
CE00363   1/1     210    233    481    504 .]   2.1     4.7
PF00067   1/1     46     504    1      497 []   416.5   2.5e-121



Query: 1672 KPHNGMYLHLKXL 1710 

K WVG++L L*KL 
Sbjct: 497 KSKNGIHLRLRKL 509
[



1 
SI 
101 
151 

2m 
za 

301 
351 
401 
451 
501 
561 
601 
651 
T01 

Tsa 

801 
551 
001 
BS1 
1001 
10S1 
1101 
1151 
1201 
1251 
1301 
1353 
1401 



3 ] 

CCASXTTTt 

ttgttaccca 
ctgcagcaga 

ALUXMXA& 
GGtTCTGACC 
CTAT7CCTCG 
AUMECKA 

axxma/L 



cagttaccca 
mtaacaaca 
tgggaaagaa 
caaggcgcca 
ctstcttttt 

ctctccktt 
tacagccccc 

TGGGG77TU 

nxscccccc 

GGGGAAGGTG 
AAATTGG6T7 
TCTTAGCCTA 
GTGGGCAATC 
CAGCTTGGGA 
ACAAATACTA 
TATCTCTACT 
ATGGGCATGA 



TTAGBCItXT 
TGAAACtACA 
GGACTAACTS 
CAiBUtCTE 
CTGCCAAGAS 
CTCTCAT&AA 
AGTTTCAGCT 
CA7GCGATCA 
ATAGGGCATA 
GAAGAAGC7C 
TCTACTTGAA 



CCACCAACTT 

mrnrm 

GGACTCCAC7 
CAACCGATTt 
OCOCACCACC 
CCATCTTGCC 
TC BCCCTCC C 
CCCTGCACCC 
AGACTGGACC 
GGGGAGGGAC 
TATTCTTCAC 
AATCAATGAA 
CCTATAAAGA 
ACTCAGGCCT 
AAACCAIXAC 
C6AG7TTATT 
GACCCAAGCA 



AAATATAGTG 
tGGAACGCTC 
CATABAACTS 
GGTGGGAGTA 
AACTCATTAG 
GAACCCAGGG 
CTGGGGCACA 
TATGAGGGfX 
GACGAATCAT 
TCTAACCCAC 
CCTTGCACTT 

CAcn cc icT 

GGTCCACATG 

TTTTTTrrrr 

CCCSCCATCT 
TCCTGCCTCA 
CCCAGCTAAT 
CAGGATGCTC 
AAATTGCTGG 
TCTCTTAATC 
CTABGCTTAA 



CAAAAAGTTt 



IttAACCCTC 
GGTCAGtXAA 
AAGGTUCCA 
ACCEEACCAG 
GCATGGAICT 
ATCATACAAI 
ATGAAAAGCT 
AECATTtTCA 
AGCTAGCAGC 
CTCTCCCTTA 
AACTXACCAC 
TGAGAfXGAC 
OGGCTCACfC 
GCC7CCTGAC 
T7TAGTACTG 
TTCATCCCT7 
GAT7ACAGGC 
ACTTACCCCt 
GGAATTGGBC 



CAfiACTTCCT 
GCAACTCGCC 
AUBBDCIt 
GS6 6 TTC CC A 
ACCACACATA 
GCAAGATATC 
CMCTCTTTG 
CATCATGATT 
GAAATGCCAT 
GACCCTCTXA 
TAGGCAACTC 
ATGGAACATC 
CCCCTCCCGC 
TCTtACTCTG 
CAATCTTTGC 
TACCTGGGAT 
TTACTAGACA 
GACCTtSTGA 
CTCAGCCACC 
CAAATAAAAT 



AGCTGTCAAT 
TAAATGAAIG 
TTGCTGGGAC 
CTTAGArCCT 
TTTCAAATTA 
TATGTTTCTC 
GCCACACGAG 



LUCAOTAA 
AATAAAIAAA 



TTTTTACIAC 
GtATCTAGAG 
GCTCTGAACC 



ATTGTAGCAG 
CATATCCOC 
TGAATGAAAT 
GGGGAGACAC 
CTGATACGTT 
ATTTTCCTCT 
TCAGCCOTC 



1451 ta i ce r ect i 

1501 nZCGCATTT 

1551 GACTGTACCT 

1601 ATCTTCCOC 

1551 TCATGCTACA 

1701 GGGAACACCA 

1751 TCCCCCTACC 

1801 AGAAAA7CTC 

1551 ATGACCTCAG 

1001 TGAAAACTCT 

1051 CCCCCGAACG 

2001 c usc ci gcc 

2051 cttcttcccg 

2101 GTGGGCGACC 

2151 CGOCCCATU 

2201 CCTCCCTGGA 

2351 BCOCB D CC 

2301 GCGGCTCCTC 

2351 TCCTTGGGCA 

2401 AACACOCGCG 

2451 CGCAGCTTTC 

2501 AGCCCGCCCT 

2551 mCCOGCAG 

2601 TGCGAGACAC 

2851 TAACTCAAGT 

2701 TTTAAAAATA 

2751 CAAAATCTAT 

2801 A l l HH I I I 

2851 ACTOGAGAAA 



TTAATGETCT 
TCATTTTGCA 
CTATAGTGGA 
nCACTETTC 
CSCTGGGAAT 
CTAGGCAGGC 
TCTACCACCT 
TTGCATAGCt 
CTTTAGCATT 
TGGACTTAAA 
AAGCCTGCGC 

rccTCTtccc 

CSAGTCAGAA 
CTACGCCAGC 
AACCSCCGGC 
GAOGCGCTGC 

recGCCTGn 

CGGGACCTGC 
CCAGAAGGTA 
CG CA G CA G G A 
mXATCCCT 
GTGGCTCTTA 
CCCCACAGGG 
CTATTTAAAG 
CTGTCTCTCA 
GCATG7TAT7 
CCCAATCGTT 
CCTAACATTC 
TACTGATTGA 



CTtATCTTAG 
CTGAGATTCA 
ATTACTAIAT 
ACTGCCCTCE 
CAGGGTGGGA 
CALAGGATAA 
GGCAATTTTC 
ATTTATAA77 
ATTTTACAAT 
AGAG6ACCCT 
GCT7TCGTAA 
AGGCCTGAGC 
GCTTCGCGAG 
TCGGGCCGGG 
6TTCGGCGC1 
GCGOGGCCCT 
GGAGGCCATT 

aeeccnax 

AATGGAAGGG 
TGCGGCAGAG 
GCCGACCC I f ' 
CCATCATTTT 
AAAGCTCACA 
AACCTGAATA 
TTGCTTCACC 
CTAAATAACT 
GGtACCCTTA 
TTCTGAAGTA 
TGGAAATTTG 



AATTG7TAAT 
IAAATTATAT 
GATGGTACGC 

ATCAGTTGTA 
AGGAATAATC 
GTAGAATGCC 
TCTGA7AAGG 
ATAAATTCAC 
CCAGGACCGC 
CCGGCTAGAA 

raccc cT CCC 

GGCCCAGAGA 



GCGCAGAGCt 
TTTACCTGGC 
AAGCTGTACC 
ACCGCCCCCC 
AAAAAGBTTA 
C AG CCCAGCC 
CGCCTTCCAE 

rtrrTcaa 



AAACTTTTTA 
AGCAGGCCCT 
TACTGTGtAT 
TTGAACTAGC 
AACCATTTAC 
A i^GCXACACC 
AGAATGCAAA 
AACAAAAACA 

ATCCOCTGAC 
AAAAGCAGTT 
ATCCCGCACG 
ACTCCCTTTC 
GGCCCTGCGG 
CCCTCTCCCG 
ATGGAATTCT 
GTTCGTGTTC 



AAGOCTTCCA 
TATTACTTGC 
GTCCATTTTA 



ACCCACTGGT 
GAAAAGGAGG 
GGCAGAGAGA 
C GG C C m CC 
GGACAATTGC 
CTTTGG GGG C 
AACCGAGCTG 
CA TGTGTTGC 
ACAAAATATG 



tAAAGGGAGA CTGT7AGCTT 



2901 nGGTCTCTC CCSTTTTTTA AATCCACTGC CACCCCTAAT TAAGGTTTTT 

2951 ATTCATTCAA CCGACTCTGA GTGGCAATTG TGTGATAGGT ACTAASATTA 

3001 CAAAGAGAAG CTAAGTCCCT CCCCTGCACC ACCCAAGTCA GGTGCAGAC7 

3052 TAGGCCACAG AGAGAAAAIG AAAATTTAAG GCAATGGGTG CTTTACTAGA 

3101 GGCCTAGAGA CAAGGGAATA TCTGTCGGAG GAAAGTATAC ATXTCCGCCT 

3151 AGAGAAGGAA GGAAAGTCTG TGAAGBGCTG AGCAGAGTCT TAAAGGATGG 

3201 TTGGGTGGTC TGGGGAAGGC ATTCCAGCAG A£CTACTA£A CGATCCTTTG 

3251 GT7TC0CCAC nTCTACTCT T7CT7ATATA AAGCAACCAC ITTCAACTCT 

3301 TTTATCGGTT TCTTCTGCTA TTT AAA TACT TATTTGTAAA ATACTATTAC 

3351 CATATTGCAT CTATTAATTT AATAAGTTTA GACATCTGCT CTCCTTTACA 

3401 TATGGTTTGT TCGTCCCCAC CAAGCCTCAT CTTGAAATTT GATTCCCAAT 

3451 GTTGGAGGTG GGATCTGATG GGAGATCTTT GGCTCATTGG GATGGATtCC 

3501 TtATGAATCT C7T C CT CC AC CT C TCTCCTT CATAACTTCT CACTXTCTTA 

3551 GTCCCTCTTC AACCCCCACA ACTGATTCTT GAAAACAGCC TGCCACCTCC 

3601 rCCC C TCTCT CTTCCTGTCI CTCACCATGT CCTCTCTCC A CACAACTGCT 

3651 CCTGTTCACT TCCACTATCA GTGGAAGCAG TCTGAGATCC TCCCCACATG 

3701 CAGATGCCAA rCCCATGCTT CTTCTACAfiC CTGCAGAATT G7AACCCAAA 

3751 TAATCCTCTT TGTGAATGAC CCAGCCTCAG CTA7TCC7TT ACAGCAACAC 

3801 AAATGTACTA AGACAACATC CACCTATGAA C7TCT TT ATG ACAGGCAATC 

38S1 AC7TACAC7T CATAfTCCAC TGTCC C AGTA ACT A TAT ACT ATTG7ATTT7 

3001 TTAAATAGAA AAACTTCTAT TTGTATTATT T7TA7TATGC AAATGTTATT 

3SS1 TACTGCT G AT C7AAATGGTC CTCTTTCA TT TTA TTTCCT1 TTCTCATAGA 

4001 ACTTTTTCCC CACOCCCACA GTATTOOTOI KKOTOWNNt KMMfiMMUl 

4osi BBBmom nonoBonsn laoaaaaaoaf hsbbooooqi hqwmtom 

4101 kh oo owm maocoooaa loaoaaooca hu w kww ct aoonoooaa 

4i5i smooaoaa na o a o aa aai Baoaaaonai Boaoaooaoi Hoooonaaoi 

4201 HBBOBaqag j o agqogq ro htgttatgta tctctactct ctutgaata 

4251 CTATGTCCTC IGTTG7TTTA A7TGAATTCT TTTGGCATCC TTG7CAAAAA 

4301 TCAATTGACC ATAAATGTXA AGGTCTATTT CTGAGTtTTC AATTCTAATC 



4351 CATTGATCTA TATGTCTATt CTAACTCATG UCXCXMA GTAGAAGGAT 

4401 GCTTACCAAA GGCTGGGAAG GATAGAGGGG AGCTGGGGGA GCAGGTAGGG 

4451 AAGGTTAATG GGTACAAAAA AAATAGAAAG AATGAATAAC ACCTACTATT 

4501 TGATAGCAtA GCAGGGTGGC TATAGTCAAT AATAACTGTA CACTTTTAAA 

4551 TAAAGAGTG7 AATAGGATTG TTTGCAACTC AATGGATAAA TGCTTGAGGG 

4601 GA7GGGTACC CCATTCTTXA TGATGTGCCT ATTTCACATT GCATGCCTGT 

4651 ATCAAAAACA TCTCATTTAC TCCATAAATA TATACACCTA CTATGTATCC 

4701 ACAACTATTA AAAATTATAA ATAAATAAAT TATATAGCTA rCCTTATGCT 

4751 AG7ACCACAC TGCCnACTG TTGCTTTGTA CTAAGCTTTG AAATXAGCAA 

4801 GTATGAGTCC CCCGCACTTT CCTATT7TCC AAGATTATTT rGCCTGTTTG 

4851 GAATtCTTGA TTTCTATACA AATTTTAGAC fCAGCCTATC AATT7CTACA 

4901 AGGAAACCAG C7AGGCTTC7 GCTTGGGATT GCAC7GM7C TGTACATCAG 

4951 mCGCGATT ATTGCCATC7 TAAGAATATT AGCTCTTCTC ATCCATGAAC 

5001 ACAGAAACCC TTTCCGTTTA CTTAGCTCAT CTT7AATTTT TTTTCTTCTT 

5051 i u mu ii ttttcacaca cagtcctcct ctctccccca cgctggactg 

5101 CACTGACSCA ATCTCGGCTC ACTCCAACCT CCGCCTCTCG GATTCAAGCG 

5151 ATTCTCCTGC CTCAflXTCC CAAGCAGCTC GCACTACAGC CACATGCCAC 

5201 CACACCAACT AATTTTTGTA TTTTCACTAG AGACGGGCTT rCACCATAH 

5251 CBCCACGCTA GTCTCGAACT CCTGACCTGC TGATCCACCC GCCTCACCCT 

5301 CCCAAAGTGC TGGGATTACA GGCGTGAGCC A1XACTCCCG GCTTTCTT7A 

535] ATTTTTTTTA ACCATCTTT7 TGTAITTITC AAAGTATACA TCTT6CATTT 

5401 CTTTTGT7AA ATTTATT7GT ITTGT7C7T1 T7AA1TTCAT TTCAGACTAT 

5451 TTATTGCATT CATACTGTTT TAGACTCCAC ATTCCCTCTT GAC7GTCACT 

6501 AACTTTTTTT I I HCIUII TTGAGAGGTT TCTATCAGAA TTTTGCAGAT 

5551 CAGAGATGAC GGACATGTU AACTGTCTAA TAT7ACCAAC CCTCCCCAn 

5601 TATCAGATCA GGATCCTTT7 GGTGATTCAC CAIGCAGGGA AA TTT ACTA! 

5651 CTAAGGCTCA AAAGG7GAIA C7GTTTTACA TAGGCAGTAA CAT71TATTG 

5701 CTACATAATA ACT AHA T ATT TATGGAE7AC C7G7GATA7T TTGA7ACGTG 

5751 CATACAATGT GCACTGATtA AATCAGGGTG TTTAGGGTAT TUTCAOTt 
5501
5351 
5901 
59S1 
6001 

son 

6101 
6151 
6201 
62S1 
6301 
63S1 
6401 
6451 
6501 
6551 
6601 
6651 
6701 
6751 
6801 
6851 
6901 
6851 
T001 
7051 
7101 

na 

7201 



TAACATTTAT 
CTTCAGAAAT 
ACTTATTCCT 
TTACAGAAGC 
fTCAITXCTT 

cctocttgc 
ccaaattatt 
acctctacta 

TXACTCTtTC 
CACCACTAAC 
ACTATAAAOC 
TAAATCCm 
TATTCAAAGA 
ATTCAATACC 
CTGTATGTCl 
ATtTATCTAC 
AETTTGAATC 
TTTAACCTAC 
TAATCATATC 
AACATCGAAA 
CTTTXCTCT1 
CAATCTTTAC 

rrnrrecGC 



TACTTTCATA 
ATTCTCATAT 
TATATTTCTC 
CTTCTCTCTT 
TTCAAAACCA 



TATTTA7TTC 
ATTCAAIACA 
TCIATCTAAA 
ATAAACTCTC 
AOCTTACTA 
ACCTTTCAAC 
CCTAAAACAC 
CTCTCTATCT 
ACACTATACT 
TACAA7ATAA 
TACTGCATCC 

gttgaatgca 
icaatttact 
caacatatgg 
ctetccatct. 
ctaactagta 

CTtArTACTG 
FCTGTACCCC 
TACTTCATAG 
GCATTTAGCA 
GCTAfTTGGG 
TAAATATATC 
ATCAACAAAT 
CCTTTir C I t 
TCGCTCGCCT 
TACTTTACAA 
rCTATCTCTA 
TATTCTTTCT 
CTACTCTGCA 



PCTTTBCAAC 
TTATTCTTAA 
GACACTAACA 
TATAGBCAAA 
ATACA6CTCT 
ATCCACCTCT 
CCTTBCCCT6 
CCAACCCTAT 
CTATACTCTT 
AATCTCTCAG 
TACACTTCCT 
TAAATATATT 
CCAXAfiAAXA 
GACCTAGGAT 
GCATGTCTAC 
XAAAfiCAi^^^r 
CTTGCCAGCT 
TCAATTTCCT 
GCTTtTTGAT 
GCAOCTACTT 
GGCACTATGC 
TACTACCCCT 
AT AAA TT AAA 

CTTTATGAAA 
TAZTCATGAG 
CCACCTGGTA 
AAATGAATAA 
tACTCAGTGG 



ATTTCAACTt 
CACTGCTATT 
mTAAfiTAT 
ATTCCCTACA 
ICAAACATGt 
CTCCTTATAT 
ACCATTtACA 
TATTATTATC 
rTCCTTCTTC 
ACGTAGGATX 
GETGCATAAT 
AGCTGCTGAG 
CCCCAGCTGC 
CTACATATGC 
rrGGATGTAC 
CGGCTCCACA 
GTACACACTT 
CATCTCTAAA 
GTAAATATTA 
CATAGCACTC 
ATTTTCTCAA 
mtAACTCT 
TATACTACAC 
TTACAAAAAC 
ATTTTTTACC 
GGCAAAGGCC 
CCTCTWTAC 
A^^SAAAAAAX 
CTGCTTATCA 



TCTTCAAfiCT 
GAACACTGCA 
ACrUTAAGC 
AGATGAGAAT 
CAAGGATATT 
rCCTCTPCCT 
CTAAAATACC 

rraaxcTTA 

CTTTATTATC 

rrrcTTTBCC 

AGCTCCTCAA 
AAAATTTATT 
TTTCACAJTT 
AJGTCTCTC1 
rCCACACAAC 
CTTAAACTGG 
GGGCAGATCA 
CTAGCGATTG 
AATAACATAC 
CTTUTAAAT 
CATTTCTCAA 
ATTTACATCC 
TATTCACAAC 
TATAATGAAA 
TAAACAAACA 

itctcttcci 
acaataaata 
attcacattg 
ctcbcttcat 



7251 
7303 
7351 
7403 
7451 
7503 
7553 
7603 
7653 
7703 
7753 
7803 
7853 
7903 
7953 
8003 
8053 
8103 
8151 
8201 
82S1 
8301 
8351 
8401 
8451 
8501 
8553 
8601 
8651 



TATGCCAACA 
ACTCTtAACC 
TAAAATTGCT 
TCTXCAGAGG 
TCACACGACS 
TtCCACCATC 



TtACTCATTT 
AAACCAAAAA 
TCTTCCAACC 
CTCTCTAAA7 
TACATAGAGT 
AGAAATAATC 



AIAGGAAATA 
TCATCTTGCC 
CACAAGACTT 
CACCA7ATA7 
TTATCCCCCA 
CAAAACTCTC 
ACCTTCTCTC 
TTACTTT777 
CCAAATTATT 
TTCACCCATT 
AGCAGAACA6 
AACTCAAATC 

reccBCcm 
taccaatact 
caggtatgCa 



TTAACAAAAA 
TTTCACTGGC 
AMTUTAAt 

atca gag ca a 
aaaaatctgc 
atgctcaact 
txcactttac 
agacaaacaa 
ctccttctc1 
tcaaactttt 

CCACCTTCTG 
TGGCAATTTT 
ACTGCTGCAT 
TCTACGGTCA 
CTTCTTTTTA 
AAGTGAGGCA 
CACA£TAAAT 
TAAIATTGAG 
AGCTTAGAAC 
AGCCTCCTCT 

CAC7CCAC77 
CAAAAATACC 

i nun atc 

C7AAGAACAC 
CATAAAACCC 
AACACACACA 
GAAGGGGATT 
TGGGCACCTT 



ACT6CACTAT 
CTTGAJCTAT 



ACAATOCTCC 
CACCCTTAAG 
CTCTACATGG 
ACAACAACCA 
TTTAACTGAC 
CCTGCTACTA 
CTATTTGACT 
nCTTCGCGC 
TTAGACACAA 
AATTACAGAG 
TGCtCCATTT 
GCTTXtAGAT 
CAGACTCTGA 
CTCATTCAAC 
AAGATAAATA 
nXCAAACAA 
GCTTCTCTGA 
CACACCTCCA 
TATTCAGCAT 
CIC C ICttTT 
TATCACCCAG 
GGGGAAACCT 
ATA GC CAAGA 

COAIATCCT 
GAACTCGGTA 



TTTAGAAACT 
CCAGAGAACA 
GGGAGCCAGA 
AGCAGATACC 
ATTAIII ILt 
CCACAOAAA 
AATCTAAATC 
CCATTACTCA 
If tGCACTCT 
TTCCTTTAAC 
CACAATTTCC 
CCATTTACTC 
ACGAACAGGA 
ACTAACTGTT 
CtCACTXCAA 
CTCMCtU t 
AGTGACTTGG 
CACTAACTTT 
AGCATCTTAT 
CtTTATCAGC 

G trrecCTc e 

CATAACATGC 

c c on cm 

ACTATGCAAA 
CTGGCACC7A 
rrCCAAAGCA 
rCCGCAGCTC 
CtCCAGTCCC 
TAACTTAAAG 



ACCTTTCAAC 
CTTTATGGCT 
CTTCTTXACt 
CTCTCATTCC 
ACCCSCSCAI 
ACACACACAT 
AATCTTGTTT 
rCTGCTTCTA 
CATTTCTAAA 
ATTCACACAT 
ATTTTCTTTA 
CTAACTCATC 
ACAA£AAATC 
AGTTTCCCAC 
ICTSAGTGN 
CCTCTATTGC 
rCTGGGGGTA 
CTTTAGAGAA 
CTCACTTCCA 

tcctgaactc 
ctaattactt 
agaagcxxga 
attgggccct 
cacacttctg 
ttcctcctag 
aagattggtt 
AtAcemxt 

TTCTCTTCTT 
CCTAGCTGGC 



8701 
8751 
8801 
8851 
8901 
8951 
9001 
9051 
S101 
S161 
9201 
9251 
9301 
9351 
9401 
9451 
9501 
9551 
9601 
9651 
9701 
9751 
B801 
9851 
9901 
9951 
10001 
10051 
10101 



ATTACCAGAC TTGCCAGGCA AGGCTTtCCT 
ACTTCAETCT CAGCAACACT TCCCACTCCT 
GTCTtAAGAE GGTGGGAAAT CACCACTAAC 
ATCAAACCCT GAATGCTACA TCATTAATTT 



rCGCCTCTCT 
ACCCCTGCTC 
TCTACCTCTC 
ACCCATCAGA 



CCGTTTTATS 
ICGACCATAA 
CTGGTTCAGT 
CCTCTTGATH 



KK2CQQ3009 

nooBssnra 
Boocgaooa 
noooooooai 
noaaaaora 
noaoosoBai 

KSKCIKKKSK 



HH OOO g Sf J I 
KMOWKKJKH 

n a oooa u nsa 

KNQQOOQffiH 
K HOOOOO K B 

nnaaaBB 
naaaoaanai 

HKKJOtKKSKK 



noosKiQoar 

NUWMUU(M( 



igoocaaqaqc 



1 0000000001 
KKSKKKKKCT 



IKlUDilQOQQI 

Boooaooag 

IWHOOtJOTW 

mmomi 

10000300031 



10300000031 0000000031 1 0300000 0 3 1 

aoooo a o M ■ra a ro a a gaooaoooai 



BOOOOOOOOI 
10000300301 
10000000001 
10000000301 
13000300301 

Hgo oaooo a 

13000000301 



13O0O00O3OI 
BQ0030000I 

nooooooaoi 

10000000301 

nooooooaoi 

BOO OO O OO OI 



W OOOOOOOOI 
13000000301 



B100D00003I 

nooooooaoi 

TGACrCTCCA 
rTCTTGGTAT 
TCCCCTTTCC 
AGCACAAAGA 
ATTGAAGGCG 
rrCATCCCTT 
GAATCTCTXC 
CTCATGCATC 



Boooooooa 

103 0 00 00 0 0 1 
CATCCCAACT 
CTATCTGCAA 
ATAGTAGACC 
CTCCAAGGTA 
AGCTCTAAGC 
TtTCTCCCAC 
CCTCTCrCTT 
X AATCCCACC 



CCCACTACCT 
ATGAGAGCTA 
ATGCCAAAGA 
GAGCTATACT 
rCAACACACC 
AAAGTCTTAT 
TAAAACTACA 
ACTTTGGGAG 



BOOOOOOOOI 
BOOOOOOOOI 
BOOOOOOOOI 
GCACAAATTC 
TAACtXACTC 
AACTGAAATC 
GAACCTTATC 
ACCACTTCCC 
rCTCAAEGCA 
GCCTTGGCCA 
GCCAAGCTCG 



t MMKXKX n 
BOQfTCTCCT 

tcacctccac 
tcattcaaac 
tgaattcaaa 
taggggaaag 
agaaagcttc 
gcaga7acat 

CGCACAGTGA 
GAGCATCACT 



10151 
10201 
10251 
10301 
10351 
10401 
10451 
10501 
10551 
10601 
10651 
10701 
10751 
10801 
10851 
10901 
10951 
11001 
11051 
11101 
1115] 
11201 
11251 
11301 
11351 
11401 
13451 
13501 
13551 



tgaggtcaag 

CTACTAAAAA 
TCCCACTACT 
CCACCTTCCC 



ACTGCAGTCT 
TCTTGGArCT 
TCACTAAAAT 
TAACCTCTTT 
AAACAAAACA 

cccccTcrrc 

CTCTCCCTAT 
TTGCTGCGCT 
TCTAATGCCC 
ATCTTCTAAC 
AACTCCTTTT 
TGAAGCATTA 
GGCCTCAAGE 

ccttctgaat 
aaggaattca 
accttatcta 
gtatttacta 
gagtcactta 
agcgtgctaa 
tctcacaata 

TTCATCTATT 
CCACTCTTGC 
rrCCTCATCT 
CCTTTIATTT 



ATTTCAAGAC 
TACAAAAATT 
TGGGAGGCTG 
GrCAGCTCAG 
TCTCTCCGCA 
CACCATCCCT 

rnecTiici 

TTTGCTCTAG 
AATTACATAA 
ATATAGGCTT 
CATTCACTAC 
CTCACTTICC 
GTTATGAGGA 
ACCATTGTAT 
CATTCAATCA 
AAACCCAAAT 

gccttggtaa 
accaaatatc 
ttatgttcat 
cttgctctct 
ttcactttca 
tcatacttca 
ttcagttact 
tcttataagg 

TTAAATACAT 
AAATTGCATT 
ATGACTGOCT 
ATCCTTCATC 
CTCAAGGAAA 



CAGCTCCGCC 

ACACATGAGA 
ATTGTGCCAC 
GCCCCCAACA 
ATTCTTGTTT 
TTTTGCACTT 
AGTTTGGCAA 
rTCCTATTTA 
GTATTAGCAA 
TTAAATATCA 
rCATCTTTCA 
TTAAATGAAA 
GAGTGACAGA 
ATCGTATCAA 
CCTAATCACC 
GTATTTCCAC 
ACCCCACACA 
CATATAGTTT 
TACTACTCTG 
GTTAACCTCC 
TTATTGAGAA 
ACATAGC7AG 
CTTTGAAATT 

GCACATTAAT 
GTGGGTCATG 
TCATCTAATT 



aacatgctga 
gctagcatct 
atcgc7tgaa 

TGCACTCCAG 
ACAAAAAAAA 
TCTTTATCCT 
CCTTTATTTC 
TATTCTGTCA 
TGTTAAACAA 
CCACTTAAAT 
rCTTGGCGAA 
CAATAAGGAT 
TACATATTTT 
TCATGCATCA 
TTATGTATTA 
ACTCTGGCAA 
ATAGTATCAT 
CTGGACArrC 
ATGGATTGCT 

agc7agggag 
acagcaacac 
aagtaaacct 
taactggtca 
tattagactt 

GGTAAGCTTC 
ACGAGTACTA 
CTTACTCCAC 

aaatggcata 
gctctagacg 



AATCCCATCT 
AGGCCTGTAfi 
CCTAGGAGGT 
ACTAGGTCAC 

TCTCTCCTTT 
CACATCCGAG 
GCAGATAAAC 
GACATCAATG 
TTCAATCTTC 
CTCATTTAAT 
GAGACTCACA 
TACCACTACA 
TGAGCCTGGA 
ATAAACTTTA 
TACAAGATTC 
TCATAGACCT 
rTGTCTCTTT 
TTGGAGTGGA 
TAG C TTC6CT 



CAGGAAGATT 
nTCAGGATT 
TGAAACTGTT 
TAAATTCACC 
CTTTGATACT 
GCTGCCTCTG 
AGGTTTTCTB 
GACCCAAGTG 
12751
12851
12951
cttccaccat
aagcataut 
aaagegggaa 
tattagctat 
aatttaacca 

TTTCTTArCT 
TCTTACATAT 
GATACTTAU 
TACAAACCfA 
rOCACTTTC 



CCICGCCTAC 
TGAGC7GAIS 
ACIGCTCTCT 
GTGTTTTtTG 
ATCTACACC7 
fCTTTCACTt 
ATTT7CACA7 
TACTAACTCC 
CCCCAAACTt 
TTAAAAACAA 
AACATCCTSA
GtTGCTAACT 
CAAT BGACAG 
ACTTTCTAAA 
CICTAAAAAA 

rBCTCCTCre 

CC BC O CT CW 
ACAAAAACCC 
AACAAAAACC 
K30OEOTCT 
19000000001 

aaooooaaai 
( 3 000 0 0 030 1

Ho eo o o aoa i 
Boooooaaoi 
SMOOOOOOOI



13053 
13101 
13151 
13201 
1**n 

13301 
U3S1 
13401 
13453 
13501 
13551 
13601 
13651 
13701 
13751 
13801 
14351
14451



SB00003SV kkhqsxsbsh b&mu&kmj tsooaoaaooj saaooooaai 
naooooaoat bkhods 



101320000031 10000300301 



ESS53ECOT HKOOOUOOa BSQQ3Q03C1 



bmmoqoow loooonoooa l eaaiaoaoai toaog a oaooi mmmuwui 



KHOTKHoai pao o oaaooi 



■ a a aoaa aar 
■ a oB Ba oaai 
naQaaoaaor 
10000030001
BBOBOaOOJ 

ACAGCrccrc 

rTTGAAAATC 

CAcreeCAGC 
ggctaaaccc 
aactctjg6tc 
catctttaac 

CTCACTGCGC 
TCCCTTATTT 
CCATTAGATC 
TT7GGTCATC 
TAACTGGGAC 
AGCACATCAA 



BMautiouoa 



KKJOCQOTCKB 82000000001 
MiMAKMtW lOKtt&WWfl 
10303000001 10000300031 



ATATTTCTTC 



BgOOOOTO 
TCCACACTK 
nCTEACTTT 
GTACCTAGCA 
CTTC7AATTA 
TCTTTGAAA6 
TtACCtATTT 
ICBETGTATT 
AC7CATG7GA 
ATAAAAACAC 
CTCCTCCTCA 
AAGATTTGCA 
CTCGATGTCT 
ACTGCCAGAC 
ACATTTTtTA 



GTCAGGC7TT 
CTCTCTTTCI 
ATTTTAAAAT 
ACTTTGGAAT 
rgCTCTTCT 
CAGACTTATG 
ACCAIATATS 

nrcccncc 

I C rTTO C TT C 



TCATGTTGCT 
CCACTCAGCA 
CTGGATATAA 
AAACAGGTCA 
ACTTCTTTAT 



Boeaoooooi 
jocoooaaaoi 

10030000001 

Roooooaaoi 
looacooQoci 
lacaoooooQi 

(3DOO3QQ00I 
CCTBSBBECA 
CACTTTTGAA 
GTGGCTATCC 
TCAAGTTGGA 
GTCAATGATT 
CTAATAATTG 
CACTTTTCTC 

rocraxcre 

C76CTBCCTC 

CTTGACCTCC 
CACAAGCCTC 
TCATGAAATG 
GTGGTGGGAC 
TAACACA^TA 



10000000031 
8 2000000001 



aooaoaaaoi 

82000000001 
ROOOOQ030I 



B CTCCCTB CA 
TttAAAOCTT 
TCGTT6AGAG 
GGAACCCCTG 
GCTITAATGC 
AAAACCCGCA 
CATCCTCCTT 
TCTAACCACA 
AGGCTTCTCT 
ATCAAOCCAA 
rCTGCCAGGA 
GAGCTCTATG 
CGCTTTCAGC 



TCCCAACTTT 



14501 
14551 
14601 
14651 
14T01 
14751 
14801 
14851 
14901 
149S1 
15001 
15061 
15101 
15151 
15201 
152S1 
15301 
15351 
15401 
15451 
15501 
16551 
15601 
15651 
15701 
15751 
15801 
15851 
15901 



CTCTTC7AEC 

aaatca7att 
ttcaaactca 
gaatcac7ac 
raxArtm 

GAGAAAGAAT 
AGAAATTTGA 
TTATCAGAAT 
TTATTATTAA 
CGAATA7G0C 
AAACA7TCA7 
ATGCACCCGA 
CAAAACTTU 
rCAATGCCAT 

accttcacca 
cac7c7catc 

CTTCTTACGA 
GGCACTCAAT 
CTTCCAGGAA 
7A7A7GCA7A 
GCTTATAAAC 
CCAACTtm 
CTTCATT&GA 
CTAAAAAAAA 
CAAAATGCAA 
TTAAAACCT7 
ATCACTtAGG 
ACCTCCCAK 
ATTACAGGCA 



ACCCATSATC 
TtACCSCTTC 
GCCCTCAGGG 
ACAGGTATTT 
GTACTCTGTC 
CTTTGTTATT 
CTTGAGTCCI 
rTTTA7ATA7 
7ACAAATATA 
1TTTTGCACA 
GCAACATAfiC 
AAGAATATTC 
AATAGATTAA 
AATCCAGAAT 
C7TACTAGCA 
7ATAAAATCC 
TCACTCCCTT 
AGATGTTAAA 
TTTTCCAAGA 
CATCCACA7A 
AATTTAACGC 
CTGCATtATT 
GTAAACAGCA 
ATCTTTACTT 
ACTCTTACGC 
nCAATTTTT 
CTGGAGCGCA 
nCAGGCAAT 
f TCCCArC AC 



CTTATCCAAA 
TACAGTTTGT 
C7ACCGCTTC 
CTTCCCTTTt 
TCTCTAGAGG 
AATGGAGC7T 
RATAAGACT 
ATA7ATA7AC 
AC7ACGCACT 
GCCATTCCAG 
AATGGAGACT 
ATTCAAAAAC 
CCTTTTTAAC 
ACTTATCAAC 
CTT ACT ACTA 
A^bA^^T AAAAA 
AA7ATATATA 
rCTAAAGTAT 
CACACCAACA 
&ATATTATAA 
ATAAATGECC 
CCAtACACAC 
ATGCATTTCG 
ITTAAATATT 
AAAAIATTAA 
TTTTTTTTTT 
GTCGTCTUT 
TCTCCTACCT 
ACCTGCCTAA 



AECCATATTT 
TCTATXACAG 
CAfiAAGTTAA 
GGTTSCCCAC 
GATAAACCTT 
miATAGAC 
TTGCTTCAAC 
ACTATTTTTA 
TAAfiACTTCC 
TAATAATAAT 

AGTTTTACCA 
CAATTCAACA 
CACTCAATTA 
ACTTACT7AC 
AfiAACCTATC 

aagcatttac 
acttatctca 

acacgacact 
atta7aaata 
aaaatcttaa 
gggaagctat 
atacaattat 
atcttaaatt 

TTTTTTGACA 
CTXAGCTCAC 
CAGCCTTCTC 
1 1 1 1 1 1 1 AAA 



CAACTCACCA 
TGACA7AATT 
CCCGACTGTT 
GTCUIACGC 
AATATGACAA 
ACTGCTXCAA 
CATAGCAGTA 
ITATCGACAA 
ACACAXACAT 
GACAACCTAA 
AAACATGCAC 
AfiCA-TAAACA 
TTACTTCTCA 
ATCCTTCATC 
TSCTTTCTTT 
TCATACATTT 
GACACTCCCT 
AATCTXTTTC 
TACACATACA 
CACACAACCA 
GCAGCAS7TC 
I CTIT T TOT 
ACACAACTTT 
GATUAAAAS 
TA7TCAAAAC 
rGGAGTCTCT 
TACAACCTCC 
AGTACCTGGG 
I ibl I lATTT 



15951 
16001 
16051 
16101 
16151 
16201 
16251 
16301 
16351 
16401 
16451 
16501 
16551 
16601 
16651 
16701 
16751 
16801 
16851 
16901 
16951 
17001 
17051 
17101 
17151 
17201 
17251 
17301 
17351 



fTATTTAGTC 
CTAATCACAA 
lAATGGCTTC 
TAGGATGTTT 
TCTCATAGTT 
CAGGGOXCC 
TTATTTATGT 
AAATCAGAGA 
GGTATTGATC 
ATTTAAAACT 
ACGCATACAT 
CTCCTCTATT 
TAATTAAGGA 
AGAAAAGTTT 
rCATATTACC 
AAACTTUAA 
TTTCTTTTCT 
TTTCTTTCCC 
rCAAGATAAA 
rC7AAACTTG 
ATSTTTTTTA 
rTGATGAGAA 
GAAAfCAGGC 
rTTATTTCAC 
AGATACAATA 
AGCATAA£aC 
rCTGCCAAGG 
AATTATGTAA 
AAACOGATTC 



AAATA7ATCA 
AAAGCCATTC 
7TAGA77AGA 
AGC7ACATCC 
ATAGCAACCA 
AGTTGAGAAC 
ITTCII G IAG 
TXTAAACTCC 
TAATTTTTCC 
TTATXTTTTC 
€TACTCCTAT 
CCATTGATCT 
GATTT7CTGG 
TAGCTTTGCA 
CATG7AACAA 
rTTTTTAAAT 
rTTn7I C I C 
rCTCTTCTAT 
ATAT7AGAC7 
AAATtATCTT 
TTT G T G IT G T 
AATAAACCAT 
AAGATTTGAA 
TTTGCCATtA 
ATXCAGGAAA 
rtCGAAGAfiG 
TAAAICTTCT 
CTAGGTGGGT 
AC7AAATTTA 



ATATTT7ATT 
TtTATTCCAG 
TAAGTCCTTG 
CTGACATXTA 
TAAATAACTX 
CACTGCCCTG 
TACTTGTATA 
ATTTAGAATT 
AAArGTTTAT 
AAGTGATTTG 
CTGTTTTCGA 
ATCTACCAA7 
CTTTTTTCAA 
GAAAAATTCA 
ACCTCTACAT 
ACT AAA T AAA 
TCAGCTCCTT 
TTCATTCCAT 



AATTTTATTG 
rTATCTTTOt 
CTGTGAAAAT 
GCTATTCACT 
CTTTWTAAT 
GAAAGAAATC 
AAGTACCAGG 
AAATTTCfAA 
AAGTGGGAAT 
ACTCTACTTT 



TTATTGtATC 
GGTTTCTCAA 
nCTCAAGAT 
CCCACTCGAT 
CAGACATTAT 
TACCCAGGTT 
ATTTCATTAT 
TATTCCTATA 
CCACTTGTCC 
AGATAACCAT 
TAACAGTATA 
CTACCAGAAT 
CATTAATAGA 
GCAGAAACTA 
6TACCCCTCT 
TATTACCTCT 
CAATTATAAA 
ITTATTTAAT 
AGAATAATTC 
CCCACATACT 
AGCTTTAATC 
TAGATCTATT 
4ACCATGECT 
TGGAAAXTAT 
CCTTXAGGCT 
ATTTTCTGGA 
GCCTGCTCAA 

rrr t wrrr x 
ooww wivi 

GAATTUTGA 



rG&ATTTTTA 
CCCTCAGCAC 

ctctgcattg 
gtac7agacc 
tgaatgttxc 
gtagagaaaa 
tttxatatti 
7atggtgtga 
catcaccatt 
cacattxtaa 
tttggatgtt 
cacactcttt 

CCTTATTTTT 
CAGACACTTC 
ATCTAAAATA 
CTTCCATATT 

tatattccca 
aacttttccg 
ctcacttgca 
gatcgaaact 
aaaagtccc7 
taaacgtctg 
tgctttataa 
rnrcTACCC 

CCBCTAAACC 
TA7TCTCCTT 
CTGACCAGTT 
GACAAGAATA 
GCAGCTTCAT 




17401 GtAATTTGAC ACAAACACAG AAITCTCCAA CTCTCTCCCT ACAGCAGGGT 

17451 TACTAAAGAC TAAAOGAACG ATTTGACAAG ATTTGAflGAT TGTCATATGG 

17501 AIACATGGA7 TTTAGGGCAT CATGAAAAAA TCETUCATS GATAAACGTA 

17551 AAAATTATGA TCATAAGGTC CTGCGAAATC rGGGAGTTTG AAGACAATTT 

17601 CTAGGGtXTG TTCATCGAGC GCeCTTTCTC CAACGCCTGC IIIIUIATC 

17651 7AA0CTTGCT TCTCCTTTAT GCTTTGGGCA CAATATGETT TATACCACAT 

17701 ATTTGTTGAA CTCAATTAAA ATTTAAACCC CTATTTAAAC CTCTCA7TTT 

17751 TCCCCTCAAA TUnATTCT CCTTCTA7C7 CCAAACATTT ATAAACTCCC 

17801 A7TTTA7TTA AAATATTTCT ATTGTACTTT CTAGCATCAA ACTCCTACCA 

178S1 GCTTCTCACA 7ATTGATCTA CACTCTGAAC TGAGCACATT CCTGTTGGCA 

ITWH GGACATGACA CCnCCCAGC AAGCATCTGC TCCATCCTTT ACTCCCTOCC 

17951 ttTCAACCCT CAfiCATCAAC AGAGATGCCC GGAGCAGGTC AGGGGCATCC 

18001 TCCGGGATGC CTCTTCTAIt ACTTGCTAAC ATCTGCACCC CTAAATTTTC 

18051 CTGCTAGTTT TCCCCCTCAC AH MUCH I ATTTTTTGCC CTGGTACCTT 

18101 AGTGACCCTA CTGCCTCACC ATATGTCTAG CTGAAACAGA AGAACTAGGC 

18151 TA ll I I I LIl TTCT7TCTAA AGAGAGCTCC AAATTAITCT CTTCTtTTTt 

18201 AGGAAAAAAA AAAAACTTTA TTTATCCATA AATTCTCTGT CATTGCTTTT 

18ZS1 CTAATCAATG CTCTCTGAAA ISTCTTATTT CTTTATTTCA CCTTGECTC1 

18301 GATGCATTGG AAATGAGGAC T7UTCCCTG GGCTGGCAC7 TAGAACTTAA 

18351 ACAA7AGGCT GCAAGTGGAC CT tt T t nCT GACAGAGCTG AATGATTAGC 

18401 TGCATTATTT AAGGCTtATT TTACACAICT CCCAGCCGCT TCTCACCAAT 

1S451 TTTA7TCCTC ACCATTCATT TTAUCTTCA GACATAATA7 TCCATGATA7 

18501 ATACTATACT TAAGTTTAGC AAATA7GCAC TGAGCACATT TTAAATACTC 

18551 AGACTTTTTT TATGACTACA ATTTATTGTG UCtXICtCI TCGCTGAGCT 

18601 AATGGTCTAA TACAEGAGAC ACCAGA C ACA CCTCCAAATT GCAGTCTAGC 

18651 ATAATGACGC CAATGATAGA GATATGTGCT GCCTAACACA AAGACATAGA 

18701 AGACAGGTAC CTACCCTG GC ATGGGAGCTC AAGGAGACTT OCTTGACATT 

18751 tacgctgact gcaggataae TAGGAGTTAG CCAGCTGGAA ACTGTCATCT 

18801 CTATCTTCCT AGACTTTAAG CATATACTGC TCTTAAIAAA GCCCAGGTTA 
18851 rGCTCTTTGC AAAGATAAAA TGTGTTCCTG ACATAATACT GCTtAAAGCG 

18901 ACAGAAAGAC AGAAATGCTA ACGACAATTC AGCAGCAGAC CAGATAAAAA 

18951 ACAGCATATT TCATATCCAA AACTCAACTC AATTGAAACA TTTCTAAAAC 

19001 CAAATTTGAC ATTATAAAAG TATATCAGAC ATCTCATTTT ataacgaaat 

19051 agaagccctt tcctaccata aactaaagat ttaatctata tagcacaaaa 

lflol tacmtcttg agtaatcatt tttaatttat tttttaactc acaaaaattg 

19151 tccatataca tcttatatat atatgtatct gtgtatatat atatgatcta 

19201 caacatgata ttttcatata tctatacact ctgcaatgac taaatctatc 

19251 aatgcacatg ttcattaact catacttatc atttttttct gctaaggaca 

19301 tttaaaatct accctctta6 caattttcaa ctatacaaat tcttactaac 

19351 TCCAATCACA TATT6TACAA TGCATtTCCT AAACTTATGC CTCCTCTCTG 

19401 ACTGAAATTT TGTATCCTTT GACTAACATC CCTGTAATCC CCCATTCTCC 

19451 CACAGCCCn CCTAACCACT GTTCTACTCT CTGCTTCm CAGTTTAATC 

19S01 TTTTAGATTT CCACATCTCA GATtATGTGG AATTTCTCTT ICTETGCCIG 

19551 GCTTATTTtA CTTAGCATAA TCTtATCCAA ATTCATCTCT GTTGTCAIAA 

19601 ATGACAAGAT ATTTGTCTTT rCTATGGCTA ATTGTTAGTC CATTCTTTAI 

19651 AIATATACCA rCTTTTCTTI ATCCATTTAT CCAGTGATGG ACACTTAACT 

19701 TGATTTCTAT ATCTGGGCTA TTGTGAATAA TGCTQCAATG AACATGGGAA 

19751 TGTAGATGTC TCTTCAATGC ACTGATTTCA TTTCGTTTGE TTGTATATCC 

19801 AGAACTGGAA TTGCTGCATC ATATGCTAGT TCTATTTTTA ATTTTTTGAC 

198S1 GAAACTCCCT ACAATTTTCC ATATGGCTGT ACTAATTTAC ATTCCAACCA 

19901 AAAGTCTATA AGGCTTCTCT TTTCTCCACA TCCTCACCAA CATTTCTCTT 

199S1 TTTGCTAATA AOCATTCTAA TGAGCATGAC GTCATCTCTC ATTATGCTTT 

20001 TAATTTACCT TTOXTGATG ATTACTGATG TTCAfiCATTG TTTTAAATAC 

200S1 CTGCTCGCCA TTCATETCTT CTTTGTAGGA ATETTATTTT AGGTTTTTCT 

20101 CATTTTTAAA rCTAETTATT I G tllTCIT G CTTTTEAATT GTCTGAGTTC 

20151 CTCATATATT TTGAATATTA ACCCCTTATC AGATGTATCA TTTGCAGACA 

20201 rGTTCTCCCA TCCTTTAAGT TGTCTCTTCA CTATGTTGAT lETTTCtTTl 

202S1 GTTCTGCAGA AGCTTTTTAG TTTGCTGCAA AACCATTTAT CTATTTTTTC 



20301 TTCTGTTGAC TATACTTCCA GAGTTGTATC CAAAAAATCA TTGCCAAGAA 

20351 TAATATCAAG AAGCTTTTCT CTATGTTTTT TTCTAGTAGT TTTATACTTT 

Z0401 CAGGTCATAT GTTTAAATCT TTAATCCATT TTTAGTTGAT TTTTGTATAT 

20451 GGAGTGAGAT AAAGGTTXAC TTTTATTCTT CTACTAGTGC ATATCCAGTT 

20501 TTCTCAACAC CATTTATTGA AGATACTGCC CTTTCACCAC TGTATGTTAC 

20551 TGGAACCTTT GTAGATCAGT TGACAATAAA T C r G TCGGTG TATTTCTGGA 

20601 CTCTTTATCC TGTTTTATTA GTTTATATCT CI C IIII IH AGAAGCTCTA 

20651 TGCTGTTTTG GTGACTAGAG CTCTGTAGTC AATTTCAGAT CAGGTAGTAT 

20701 GATGCACTCC AGCTTTCCTC TTTTTGCTCA AAATTCCTTT GGCTATTTGA 

20TS1 GTTTTTTTAT TGCATACGAA TTTTAGGGCT rTTTTTTTTT TTCGATTACT 

20801 GTGAATAATC CCATTGCAAT TTTGATGGAG ATTGCATTGA A TCTTTCCGl 

20851 ACTATGGATA TTTTAACACT ATTAATGCTT CCAATTAATG AACACAGGGT 

20901 ATTTTGCAAT rTGTGTTTTC TTCAATTTCT TTtACCAGTC T TTTTTTCTl 

20951 AATTTAATTG TTTTATTTCC ATAGGGTTTG GCTAACAGCT GGTGTTTGC7 

21001 TATGAGTAAC TTCTTTACT6 GTGATTTGTG AGATTTTGAT GCACCCATCA 

21051 CCTAACCACT ATACACTCTA CCCAATTTCT AGTCTTGTAT CCCTCACCTC 

21101 CCTCCCACCA mCCCCCAA GTCCCCAAAC TCCATTGTAT CATTCTTATC 

21151 CCTTTCCATC CTCATAGCTT ACCTCCCACT TArCAGTGAG AACATATAAT 

21201 CTTT CCTTtT CCATTTCTGA CTTACTTCAT TTAGAATATT CCTCTCCAAT 

21251 TCCATCCACA TTGCTGCGAA TGCCTTTATT rTCTTCCTTT TCATGGCTGA 

21301 GTAGTATTCC ATACTATATA UTCCCACAA fTTCTTTATt CATTCTTGAT 

21351 rCATGGGCAT TTGCACTGGT TCCATGTCTT TACAATTCCC AATTCTGCTG 

21401 CTACAAACAT GCACGTGCaA CTGTCTTTTI CATATAATGA CTTCTCnCC 

21451 TCTGGGTAGA TACCCTGTAG TGGGATTGCT GGATCAAATG CTACTTCTAC 

21501 TTTTAGTTCT TTAAGGAATC TCCACACTGT TTTCCATAGT GCTTGTACTA 

21551 GTTTACATTC CCACCAACAG TGTAGAACTG TTCCCT G TTC ACTGTATCCA 

21601 CACCATCATC TATTATTATT TGATTTTTTG ATTATGCCCA TTCTTGCAGG 

21651 AETAAGGTGG TATTGCACTG TGCTTTTGAT TTGCATTTCC CTGATCATTA 

23701 GTGATCTTGA CC ATTTTTTC ATATATTTGT TGGCCATTTG TACATCTTCT 



21751 TTTGAGAATT GTCTATTCAT UCCM I UC CATTTTTTGA TGGGATTATT 

21801 I GTTTTTTTC TTGCTAATTT GAGTTGCCTG TAGATTCTGG ATATTAGACC 

21851 TTTGTTGGAT GTGTAGGTTG TCAAGATTTT CTCCCACTCT TTGGGTTGTC 

21901 TGTTTACTCT GCTGATTATT TCTTTTGCTG TGCAGAAACT TTTTAGTTTA 

21951 ATTAAGTCCC ACCTATTTAT CTTTTGGTTG ITGTTCTTTT TTGGGGTTCT 

22001 I HG II lia C rT EG TTTI G CATCTGCTTT TGG 6 TTCTTC GTCATGAACT 

22051 CTTTGCCTAA GCCAATATCT AGAAGGGTTT TTCTGATGTT CTAGAATTTT 

22101 TATGGTTCAG GTCTTAGATT TAAGTCCTTG ATCCATCTTG AGTTGATTTT 

22151 TGTATAAGCT GACAGATGAG GATCCAGTTT CATGCTTCTA CATGTGGCTT 

22201 GCCAATTATC CCACTACAAT TTGTTGAATA GCCTTAATAT TTAAAGCTTT 

22251 ATATATTTAG GTGTTCCTAT TTTGGCTACA TATTTATTTA CAACTATCAT 

22201 ATCCTCCTGA TGCATTGACC CCTTTCTCAT TATATAATGG I CTTCTTGTC 

223S] TCI ITHACA 6TTTTTGTCT TAAAGCCTAA TTTGTCTGAT AAAAGTTCAC 

22401 CTACCTTTGC TCTCTTTTGG TTTCTATTTG CATGGAATAT TTTTTTCCAA 

22451 CCCTTCCCAT TCACTCTATC T G T G TTCTT A AAGATGAAAT GAGATGCTGT 

22501 AGGGGCATAT GCTTGGCTCT TGTTTTATTC ATTCATTCAG CCACCCTTTT 

22551 GATTAGAGAA TTTAATTCAT TTCTATTCAA CGTAATTATT GACACACAAG 

22601 GACTTACTAC TGCCATTTTG TTAATTCTTT TCTTGATCTT TTATAGATCT 

22851 mCTTCCTI TCATCCTCTC TTACTCTTTT CCTTTCTCA T TAGCTCCTTT 

27701 TCTCTACTGC TCTACTTTGA TTTTTACTTT TTATCTTTTC TTGCTCTACT 

22751 ATAGGTTTTT GCTTICTCCT TACCATGAGG CTTACATAAA GCATACTTAT 

22801 AAAAGGCTAT TTTAAACTGA TAACACCTTA ACTTTCAACA CTTAAAAAAA 

22851 CTATACACTT TTACTCTACC AACTGCCCTC CATTTTAIGT CTTTCATGTC 

22901 ATAATTTACC TAGTTTTGGA CA TGTGT CCt CTTATTGTGT ATCCC1TAAC 

22951 AAATTATTCT AGCAACACTC ATTTTTAATA GTTTTGGCTT TTAACTTTAT 

23001 ACTAGAGATA GAATTAATTA ACATACCACC ACTACATTAT TAGGCTATTC 

23051 TAAATTGACT ATGTATTTAC CTTTATCACT GAGATTTTTG TTTTtAATTT 

23101 TCATCTTGTT AATTAGTATT CTTTCATTTC AACTTGGAGA ATTCACATTA 

23151 GUI 1 1 [III TAAGATGGGT CTAGTACTGG TGAACACCCT CAA CTTTT61 
23201 micro* atctctttac citictnt * ttttcaaata m crnra

23251 TCCATGATTt AAATGCACAA AAITCTTTT1 rTAATTAIGC AAAGTGCCAG 

23301 CCTAAGCAGA ATTACTCTTT rTTTTTTTTl ntAGAOXA GTTTCACTCT 

23351 ICTTGCtUC GCTGGAGTGt ACTGGCGCAA rCTCTtAGCT TACCCCAACC 

23401 rcTBCcnrc acsttcmgc gattcttctc cciaarn gctcagtacc

23451 TGCGATTACA GGCATGCACC ATCATGCTCt GCTAATTTTC CATTTTTACT 

23501 wurnrmx rrmcuK iiuicaggc icncmu caccccacct

23551 CACATCATCC CCCCACCTA6 GCCTaXAAA CTECTCCCAT TGCACCTCTC 

23601 AGCCACTCCC CCTGGCCAGA ATTACTOTA TTTATCCTGA QCTTCACGAA 

23651 CAAAGAATTC AAAATTAAAA TTTCACATTA CCTAATGGCC AAACCCTCCA 

23701 TTXAAAATAA CTAATCAGAA AAACATATAA AAACACAATA AGATAAACAG 

23751 ACTAAATAIA TGCAGTCATT TTATGGAACC AATCTtACTA GATTGGATGC 

23801 AGACTAGGIA GGATGCAAAT TTAAAAAAAA CTTTATTCTT CTTtXAOTA

23851 TAAACTTTAA ACCTGCTTTC TGGAGCAACT TCTTTTTATC TCTGGGGAAA

23901 GATCCTGACT AACTCTCATA GAGTTCTCAT TCATTTAAAT CACAAGAACA 

23951 ATtTTAGGTC AGTAATTAAA CTATCTGGCC CA6TCTAATA CTGAAACTTT

24001 CAAATACTTA TCCACTTGAG CT t n C rrT X CATCCCAGCT TCCTACTTCT 

24051 fTGCTCCTAG AAGCCAGCAG TGETTTATCA TCGACTTAT7 CTTACTCACT

24101 AGCTCCCCAA TACOCACTAC CTCCTSTT T C IGGCOCCIC C AGGAATGCTT 

24151 TTAG&AGGAA AGGGGATAAG GACTAAAGGC CTGGTACTAT TCTGATCATC

24201 CCAAAGGCCT TCCTCGATAT TCUICCTTt CCT Tt tTCTt AACAC G AAAt 

24251 ro e CTTTCT T CCAGACTCTC TCACTACAAC TTTCCACAGC TCATTCACGC

24301 GACAACACAA TAAITCTOCT TAGGCAGACT CTTTTTCAAC CTGCTCCCAC 

24351 ACCTTTCCn CTTGCCACTT AATTGCTTTA ACGACACAGT TCCACATCCT

24401 rccCTTBXT c r uc i tc r t r ccr c i u xn i c i t m i rr ctgacttata

24451 GCCTTTCACA TtACTtCTCT ACTCCCCAAA CTCCAAGGAC CACAAGTCAG 

24501 ATCATCTAAG TGArCCTCTT GAAGCCTCTT GTTTAAGATC GGGGAAGCAC 

24551 CCTT CC Trn CCATGGCACT CTGGCAT7CC AACAACACTT TAAATAATTT 

24601 ITTCTCTCAA AATTCTTAAG Ct TCT C CTCT TTAATCCTTC GCCATTTTTA 



24651 TGTATTATTA CTTTATATCA TGAGCTAACA GTTACAAAAC TGCTTTTTAG 

24701 AAATCTCCTT AGCAAAIGTT TTACTGCTAG TTTAGtAGCT CACTTTAIAA 

24751 TAAGGATAIA TCATATATTT CT TTGCTTtt t C TGCCTCT G GGACCTCAGC 

24801 TCATCCTGAS GCAGAGAGTC CCATTTTAAC ATTCTCTTAC ATAAACCACT 

24851 GCCAAAATGG CTTTAACCTC AGGCTAATAA TTACCAGGAA CAAACAGAAA 

24901 ACACAAAAAA ACTAAACTGC fTATGATATC TCACTCCCTT CCtTXCCTCA 

24951 TCCTCACAGG CACCACCTtG CTCACATCTC CTACACCACA ATCTGCATCA

25001 ABCAGAOGTC CCCATTUn CCTCCAGTCC CCTCCATTTC CACAGATCTt

25051 ACCAACCCAC TTA C CTTCCC ACATGGATGC ACATTGCCTG CAGCTXTTTA 

25101 CATTCTTTTC CTAACCACTT CTTAGAGGCT ATGBCATCCT GCACACCACA 

25151 GTGACAAAGA TTAGTGACTC TCTTAGCACT r CG AGAA C Tt AAAACATAAT

25201 GCTAAXATGT GACTTACGTT ITATCACCTA TGAGGAGCTt AGAGGATAAT 

25251 GCTTTGGTCA GACATCAATT TCAAIGACTT TCCCAAAGGC ACATAGCCAC

25301 TTCCAGCAAA CCTAAGCCCA GAATCCATCT CTTTOCAATC CCACCCCAGG 

25351 CTCTCTTtCA FTCTGGGACA TCATTTCTAA GATAATCTTT CTTTGGCtCA

25401 CTTTGAGAXt CAGCTGAAAC TTCATGGAM ATAGCACCAG CATCTTTATC 

25451 TGAAAGACCA AGGGGGATCT TTGGCCTCA7 CATCATAATA TCACCCTTAT 

25501 AAATATACAA CATTTAATAS TTAATATAGA GCCTTCAfiAC CCATTATCTC 

25551 ATTTTTCCCC TTGGAATCCA ATCTTAACAG ATGCTTATAC AATGATTTAC 

25601 AGTTCACTGA ACACTT7TAA CTACTTTCAA ICTGGCOAA AATCCAGACG 

25651 CACCCCCAAT GTCTAGATGA CAtTAACTCA KTGAGCACA GC7ACAACTT

25701 CrCCGGACAC CCTGACTCTC GAGCCTACAG ML It UL AA CAACACAGCT

25751 TTCTGAGCAC GGCTTATAGG AAGCACACGG GTUTCTCAC ACATA7TATC

25801 TCATTCAATG TTCTATTAA7 TtATGTCTTA CCAACCAAG C CAACAGGATT 

25851 GCTTCTGGCA AACACCTACA GCCTG7TAC7 GTAACTTTGC TCACACACCC 

25901 AGAATTAAT7 TCTGGAAGCT AGAATTATTT CTGGAAACCA AATAACCCTt 

25951 ACATTCTtlt 1 C 1M1UH TKIACTCTG7 TTCTCCCCAA ACCACATGGA 

26001 7A7TTGCCAA AATTtTCCAC TTTOCATATG TGAATAGCAC CAATGGAAA7 

26051 nCTCATGES ATCTGCATGA CAGAATCACA GHC I tlCH, I E T G T C I G T 6



26101 CSTTTTCCTC TCAAGACAGA GTC7TGCTA7 GTAGCCCAGG CTGGAG7ACA

26151 GTGGCGTAAT CTCGGCTCAC TttAACCTCT GCCTCCCAGG TTTAAGCAGT 

26201 rCTCCTGCO CA6 C CTCC CG ACTAGCTCGG ATTACAGGTG CACACCACGC 

26251 CTGGCAAATT TTTGTATTTT TATTA6AGAT GGGGTTTCAC CATGTTGCCC 

26301 AGGCTAGTCT CAAGCTCCTC ATtTXGAGAC CAfiCCCTCCT CAGCCTCCCA 

26351 AAGOGCTGGG ACTACAGCCA TGAGCCACTG CACCCAGCCA GTTCTGTG C1

26401 TTTATACCTA AATTGTCTCC AGGACTGCTT AATAGTCCAT TAATAGGTAI 

26451 TTAGGCCAGG CACACTGGC7 GACGCATATA ATC0CAATAT nTGTGACAC 

26501 CAAGGTGGGA AGACTGCTTG AAGTTAGGAG ICTCAGACTA GCC7GGGCAA 

26551 CATAGGCAGA CCCTCTCTTT ACAAAAAAAA AAAAGACAGA CATAGCCAGG 

26601 CATGCTGTTG CATGCTTG7A TTCCTGCCTA CTTCGEGGAC TGAGGCAGGA 

26651 GGATCACTTG AGCTCAGAAG TTCAACCTTA CCCTGAGCAA TGTTXACGCC 

26701 ACTCCTCTCC AGCCTGATTG ACAGGCCAGA CCCTCACTCT AAACAAAAAC 

26751 AAAAAACAAA TATTTAAGTA ATTTCCAAAC ATAGCACAAA ATATAACCAT 

26801 GCTTTATCAC ITTGA7ATGA CACCAACAGC TAC1TAAGA7 ACACTCATGA

26851 ATTCAGTAAA IT GTTCTGTG GAAAGCTAAG GTGCCAACCC AAGC0GCATC 

26901 TTCTTAGCTC CTCCT CA CTC CTCTCATCAG CTACAGCAGC CAGAGCATTC 

26951 CCAGCAGCTA CCTCTTCCO TCAAGAACAA AAGTCTTCTT TAAGAGCACA 

27001 CTAGCCCACA ACTTCCTCTT TCTCCTGCAG TC7CTTTTAT rTtXCTCCTT

27051 rCTTAGGGAT CACCCTGCTT CTTAGTATTT CCCCTtTT t A CCACAACCCT

27101 GCTGTCTGGA AAAACCCAAA GCTATCATTC TCTCTTCTAC ATAAATACTT 

27151 CCAAGAACTA ATECTCTGCA AGTCACTTTT nXTAGCTAA QCAHACAACT 

27201 GGCTATATAA TTAAGGGAAA TGACA£AAAT lAAACAAAAA TAAACATAAA 

27251 AGCCAAAAGA AATGTAAAAC TAITtTATGT TCTTGAAACA CTCTTGACGT 

27301 GTATCAGTGA H i t [ H UT GTAAGCCACT AAGCTTTAAG ATtTATTACT 

27351 rCTAACAGGA AGCTGGAGf A TATGTCTtTC fAATAATTGG OCACATCATC 

27401 Annum gatttctaas tggatgcaca tccatttcta agtggatcta 

27451 TCTCCATAG7 GAAAATAATA CCACTTCCCA TAGTATTTTT CTTTCCCTGC 

27501 CTATXAGACA AA7CAGCTGT GAAGCTGCAA GGTCTGCAGG TCTGAAGGTA 



27551 CACTGCCCAfi TCTAETAGCC ACGGGCCACA tACGGCTACT GAGCACATGA 

27601 CATCTGGCCA CTTCGAATTG AGTTCTGCTG TAAGTTTAAA ATACGTGCTG 

27651 GATTTTGAAC ACATAGTACC CTAAAAAAAT GTGAAACATT TCtTTTTAGT 

27701 AATTATTTAT ATTGATTACA GGTTGGAATG GTAATTTTTG GTTAAATAAA 

27751 CTCTATTAAG ATTAACTTXA CCTTTTAAAA ATGTGACCAC CAGAACATTT 

27801 TAAATTACAC ATCTAGATCA CAITATATTT CTATTGATtG GTGCTAGGTG 

27851 GTAGGTGAAG AAATGTGTTC ATGTTGTTTG GGGGATGGTG TTGGGG7TGT 

27901 CCTCTCArrr CAGGTCTTTG ACCCCTTGAG GTTXTCTX AG GAGAATTCTG 

27951 ATCAGAGACA CCCCTArCCC TACTTACCAT TC7CAGCTGG ATCAAGGTGA 

28001 GAACAATTTG AACTTGCTGA AAGTACCCAA AGATGTTTAC TTGAGAGTAG 

28051 ITTATTCCrr TCAGCTCCTC AGCTCTATAC ATTCTTCCAC GCAACCGTAG 

28101 ATCTTGGTGC CTATTTGAGC CCCAAAGGAT CACTTAGTTT TACAAAGGAC 

28151 AATCGTATTC TCTCTCACAT CCTTTTTGGC CATGCCTCAA AAGCACTCCC 

28201 ACAATCTAAC CTACTGCTtA TAGGCTCAA7 GCAGTCCACC TTCAAAGCAA 

28251 CAGAAATAAT TTCATGAGTA ACTXCAACTG CCGOCTTGTT ATAGGGAAGG

28301 CATCATGTTG GAGCCTCCCA GCTCAAATTC TXACAGTGAA CAATTTAA6T 

28351 CTAAACrrCA AAAGTTTCAA rGGCATTTGG TGGAAAAAAT ATCACTTTAC

28401 TCTCTACrTC AGACTTCTTC TACTAGTATT TT ACT A TACT CAGAAGAAAC 

28451 ATCA T T TTT T CAAGTATCAC TTTCTTTCCC rCTTCTCTTt AGGAACTGCA 

28501 TTCCGCACCA GTTTGCCATC ATTGAGTTAA AGGTAACCAT TGCCTTGATT 

28551 CTGCTOCACT TCAGACTGAC TCCAGACCCC ACCAGGCCTC TTACTTTCCC 

28601 CAACCATTTT ATCCTCAAGC CCAAGAATCG GATCTATTTG cacctgaaga

28651 AACTCTCTGA ATGTTAGATC TtAGGCTACA ATGATTAAAC CTACTTTCTT 

28701 TTTCGAAGTT AAATTTACAG CTAATGATCC AAGCAGATAG AAAGGGATCA 

28751 ATCTATGGTG GGAGGATTGG AGGTTGGTGG GATAGGGCTt ItTCTGAAGA 

28801 GATCCAAAAT CATTTCTAGG TACACAGTGT GTCAGCTAGA fCTCTTTCTA 

28851 TATAACTTT6 GGAGATTTTC AGATtTTTTt TGTTAAACTT (CACTACTAT 

28901 TAATGCTGTA TACACCAATA GACTTTCATA t ATTT T CTGT IGTTTTTAAA 

28951 ATAGTTTTXA GAATTATCCA AGTAATAAGT CCATGTATGC TCACTGTCAA 
29001 AumtCAA cactalaaaa tcatgtagaa taajlutttt AJUICTUCT

29051 rtATTTAGCC GACA7TCCAT GCCCTGACCA ATCCTACTGC TTTTCCTAAA

29101 AACAGAATAA fflBCICTBC ATTCTTTCAG ACTTTTTCCT ATALAITTTA 

29151 lATCTAGAAA ICTAGCAArG TATTTCTATA CAtGTGATCA TTCCTAIATT 

29201 CTT ATTUn nTTTCACTT AATAAAAATT CACCTTATTC CTTATCATTG 

29251 CTTTATGCIA ntfCTAATA TGAATCIACT ATAAITTA7T TAACTATTTT 

29301 CCTTATTCGC CATTTAACTT ATTTCTACTT TTAAAAACAT GCTTGTCAAT 

29351 GBCAACAAAA CCCAAAATTC ACAAATGGGA TCTAATTAAA CTAAACACCT 

29401 TCTGCACAGC AAAACAAACT ACCATtACAC TGAATGGGCA CCCTAIACAA 

29451 TGGGAGAAAA TTTTTGCAAC CTACTtATCT CACAAAGGCC TAATATCCAC

29501 AA7CTACAA7 CAACTCAAAC AAATCTAUA UAAAAAACA ACOOCATCAA

29551 AAAGTGGGTt AAGGA7ATGA ACAGACACTT CTCAAAAGAA GACATTTACG 

29601 CAGCC AA AAG AOCATGAAA AAATGCCTAT GETCACTGGC CATCAGAGAA 

29651 AIOAAATCA AAACCACAAT GAGAf ACCAT CTCACACCAG TTAGAATGGC

29701 AATCATTAAA AACltAGGAA ACAACAGGTS CTGGAGAGGA TCTGGAGAAA 

29751 TA6GAAGACT TTTACACTCT TGGTGGCAGG AGAATCACTT CAACCCGEGA 

29801 GGGGGAGGTT GCAGTGAGOC GAGGTCGOGC CACTGCACTt CAGCCTGGGC 

29851 GACAGAACGA CTACTCCATC TCAAAAAAAA AAAAAAAGGA CACCAAACTT 

29901 CTCAATCTTA ATCTTSTCAT CTATGTGGTA TCTTCCATAA TCTCTCTCAG 

29951 ACAGACrCAT CTTTTGCTGA TATGATCTTA CACTATTTTT IGTTTATAtt 

30001 ATTATAATCT UTTAATTGC AGCAACACAA ATGACAAAAC ACAACTCATT 

30051 rCTCCCCTTC CATGACCTAA TTTGCTTTCA CTCTTCCATt ATCACTTATA 

30101 ACATGATGAT TCTGAAATTC ATCTACCTAA AATCTATATA TAAAAAAATt 

30151 CCTCCCTTGA ATTCCACA7C CTTGGAGACA AACAOCCACC TCTAAAACCA 

30201 AATTTGTTTA ACACTBGAIX ACTCCTCCTG TCTGACTTTC CATTTTGTCA 

30251 CTAJTTTCTC AGCfGGTATA CCAAfATCCA COCACTTAAA CAATATTTCC 

30301 rer tmn ctggtacaaa cccaaataaa mcAAACAT caataaaact 

30351 AAAATTCTAA AATAACTCAC TTTCTCTATA lATCTCCrft ITGCTGGAAA 

30401 AATGGGTTAG CTTAGTTCTT TAAAAGCATG CATGATAAAT TGTACTGAAT 



30451 ACAATATTXA GGTCTGGACA TACTAGGTAT AATTTTCTCT CltltlGGGG

30501 rtTTACCTAT TTGGGGTCAA AATAAACAAG TTTATTAACC TTATTAATAT

30551 ICAATTTCAT TATCTTtm AACAATTATC TTtCCTGCTA GTTTCArfGt

30601 CAATAATTTA TTTGTCAGCT TGCCAGGTGC TTCTAAJUTT CTGTGTAm

30651 mCATATCC AATTTTACTT TAAAJATTTT TAGAAAAGAC CTnCTfAAA

30701 rrnXTAATA ATTATTATAT TAITGTTTTT TCACTGACAT TTTGTGAATT

30751 GAAAACCCTT AAAAATAIGA AArCATTTTT TCGAAATATC TGCCACAGAC

30801 AATTTTGTTA AATAACAACA CACAAACAGC GCATTATCAA GAGA7AAATA

30851 TTCAATATAC CTTATATTTX TGTCACAUT TTTTATACCA ACTCTGCCAA

30901 AAATTCTATA TCATATAAAT GATAACAACT TCACAAAGGC ATTCCTTTAT

30951 CCCIIAACTC TCAAATTASA AACTTCATA GCTACGAAGT AflEEGAAGCA

31001 TATAIICCCI TTGAAAGGTG CAAGAAAATG ICATTGGCAT TtACCATGGT


31051 ACTCTTCAAG AIGGACTGtA aaacatttac

31101 TATTTATTGC GTAttTTTAT CTTTACATAA ATATTGAAGA tatctcacat

31151 ACCTCTTTCA ATCAGATTAT
Count:
BU 

rnliiu 

267 CCABCCfCTCnAfiCCTCCTAMrATACTBCAAAAACTTCOC MI tCCTl [CTT ACCCA 

T&UUOAaTCUACCCTGCTGGACAGBEE^ 

CGT£A£CCXA££CG TTTXCAt^TCTk\aXTtX£AA£AGAACTtA TT AGA.UXTCAIX* 

ACCACACATACTATTCCTCGCTCTCA 

[T,C]

GAACMCCCMGCACCGGACCAGGCAAGATATCACAAAGCT&UCTTTCAGCTCTGGGGC 
A£AGCATQMTCTU£nriTTGGCCC7ACCACCAraCCA7UTATUQ£CCA7CATAC 
AACCATtATtATTTGGGKAGCAATACBCCATAGAGGAATCAfATCAAAAGCTGAAATGC 
CATCACTTACCCAUACUtXTCTCTAACCCA&tG&ATTCTGACA^^ 
ACATCTACTTCAAGCTTGCACTT AGCTAGCAfiCIAGKAACTCTQGCAAACAA£CACCT& 

2S4 CCAGCCTCTCTTAGCCTCn'AAATATAflVCAAAAAnTCCACACTTCCTTIVrTACCCA 
TGAAAGCACATG&UCCGT&CTGEACAGGGGCAAnttCCCTGGAGCACABCAETAACTS 
CATAGAACTGTCCAAGCC7CAGABG&A£TtACACCACGAfitAA£AACCTGGGTCGGAC7A 
GGTCAfiCCAAGGGETTCCCAGGCTCTSACCCTGCCAAfiACAACTCAnAGAAfintACCA 

accacacatactattcctcggtctcatcaa&aacccagcgacc 



K.AJ 

UCCAGBCAASATATXACAAAfiCTXAARTT^ACCTCT^BGGCAGAGCAICEATCIKKE 
IL Tt lUJXI IACCACCATBDCATCATATCABBfiCCATtATACAACCATCATCATTTCEG 
GBMCAATAGGBCATACAfiGAATCATATUAAAGCICAMTCCCATCACTTACCCAGAAfi 
AAKTCTCT AAtXCACAGUTTCTCA&AXCtTCTCAAA T AACAACATCT ACTTCAACCTT 
BUCnAOCTAGCACCTACOCAACTCTCCCAAACAACGAOCTCAAACACMtX lli l Clfcl 

1265 0CTCTCTT AATCACTTACCCGCCAAATAAMTCrGGCTCCACACAGTGCAGCCTAGCCTT 

CAATTCTACCACAAA7TGGC777ATTCTTCACAGCTCTCAATCAACArrrAACA7ATGCC 
TCTCTTAGCCT AAATCAATCAATAAATGAATCAAT AAAT AAATGAATCAAATCTCQCCAA 
TGCCTATAAACATTCCTCEGACAGGCAGETGGBBGGAGACACCAGCTTGGUASTCACBC 

[T,C]

TOTAUTCCTASTTUO^CCTGATACOTACAAATACTAAAACU 
AT77TTACTACA 1 1 1 1U.1L1 l ATtTCTACTttACTTTATTTAlWI I 1 C I UO TCTASA 
CTXAGCCCTTCATGGGCATCAGACCCAACCAGCCACAOEAGGCTXTGAACCCAGAA&ACC 
ATA m 1 C Cfi I TT A A m 1 C T C T C A TCTTASAATTCnAATAAAfiTTTTTATCCCGCATT 
rrXATTnCCACTGAUTTCATAAAnATATAGCAGGCCCTGACTCTACCTCTATACTQG 

24S7 AtCCArBUWTTCrCCTSCCTgAMCCCBCTCEBCCCCCCCCrTW 

CTTCT CC CT CCCC CTCBGCCTCCTSCAOCCCATTAACCTCTACCTC Cli li ttC CAgCCCT 

CCTAAAT G CAACttAA lUA C O T f ACAAAA <K ACCAACA CCCCCCCBC AOCABEAT0CCCC 
A C A fiC A C C CO ECCCCCA&AfcACACECAi^TTCtTCtATCtCT UXt A CtC TCOOBCn 
CI.C.CJ 

CAceggrmccAatcgrttcTCBCTmAg^TUrrTT^ 

CAGCT A TTTAAACAACCTCAA IATreAAAAA£AA>£CSACCTCT AACTCAAfiTCTCTCTC 
TarTttTTCAaXttCtmCACAI C I t rT tt mAAAAATACCATCnArrCTAAATA 



ACTTAnACTTGCAGAAAATATGCAAMTCTATCCCAATCGTT^ 

4438 TCTTATCTAItTCTACTCTCTCArCAATACTA nt lC ti l C I W I » I T IAATTCAATTCTT 
TTEGCATXCTTCTXAAAAAICAATTEACCA T AAATGTtAA£CTtTATTTtT&A£TCTTCA 
ArrCTAATCCA1TUTCTATATCrcTATCCTAACTCATGGACACACAGA£7ASAAGCArG 
GTTACCAAAGGCTGGGAAGGATA&AGGGGAGCTGGOGGAGGAGGTAGOUAGCTTAATGG 
CTACAAAAAAAATACAAAGAAT&A 
[G,A]

T AACACCTACT ATTTGA T AGCAT AECACGG TGCCT AT ACTCAATAA I AACTCT ACACTTT 

T AAAT AAA&AGTCT AAT ACCA 7TV. 1 1 1 UJUnXAATGCATAAATGCTTGAGCGCATGCC 

TACatATTCTTUmtCTGCaATTTCACATTGCAT&CCTCTA 

TTACTCCAT AAAT AT AT ACACCT ACTA TCT A TCGACAACT A TT AAAAATT AT AAAT AAAT 

AAATTAT ATACCT ATCCTTATCCTACT ACCACACTCCCTTAl 1 1 1 TCCTl 1U ACTAAOC 

4522 TCnATCTATCTaACTCTCTaT^TACTAT t rC C T CT nmn A An^TTOT 
TTUXATCTITCTCAAAMTCAAnCACCATAAATCTCAACCTCT 
A TTCT AATCCA TTCATCT AT A TGTTTATCCT AACTCATC&lCACACACACT ACAAGGA TC 
CTT ACCAAAGGCTGGCAAGCA T AGACGGCACCTGGGGCAGCACCT ACCCAACCTTAATCC 
CTAUAAAAAAATA&\AACAATCAATAACACCTArTATTTUTACCATA£CACGCTCGCT 
CCA) 

TACTtAATAATAACTVrACACTITTAAA7AAA6A6T^MTACCAT1CTTTGCAACTCAA 
TSGAT AAATSCTTCAfiGGGATGGCT ACCCCA TTCTTCA TCA TGTCCCT ATTTCACATrcC 
ATCCCTCTATCAAAAACATCTCATTTACTCCATAMTATATACACCTACTATCTAICCAC 
AACTA TT AAAAATT AT AAAT AAATAAATT A T AT AfiCT A TCCTT ATGCT A£T ACCACATTC 
CCTTACTS1 ran I fO ACTAACCTTTCAMTtA66AA6TATCACTCCCCC6CACTTT6C 

4522 TCTTA TCTA TCTCT ACTCTCTCAP&AATACT ATCTCC TCTCTTGTTTT AA TTQAATTETT 
TTCGCATCCTTCTCAAAAATCAATTGACCA T AAA TGTCAAGGTCT A TTTCTSACTCTTCA 
ATTCT AATCCATTGATCTA T A nTTCTATtCT AACTCATGGACACA&AGAST AUACUTC 



CTTACCAAAfiGCTCOGAAGGATAEAGGGGACCTGGOCGAGCAGGTACGGAAGCTTAATCG 
GTACAAAAAAAATACAAAGAATGAATAAC^C7ACTArrTGATAGCATAGCAGGETGGCT 
It A] 

T ACTCAATAA TAACTCT ACACTTTT AAAT AAACA6TGT AAT AGGATTGTnGCAACTCAA 
TT^T AMTCCTT&^GCSWTSGGT ACCCCATTCTTCATGA TCTGCCT AnTCACATTCC 
A TGCCTET ATCAAAAACA TCTCATTT ACTCCAT AAAT A I AT ACACCT ACTATCT A TCCAC 
AAfiT A TT AAAAA TTAT AAA T AAATAAATT AT AT A£CT A TCCTT A TGCT AGT ACCACATTC 
CCTT ACT61 reCm b I A 6TAABCTrT6AMTCAfiCAACTATSACTCCCCCOCACTTT66 

&07S TTTCTACTAAianfaMro>BCMCT 

TT ATTTTOCCTC I 1 1 Gt AATCCTICATTfCf ATACAAATTrrAGACTCACDCTATCAATT 
TCTACAAOCAAACCACCTAC CC T TCTGCT I CCCAnttCACTCAATCrCTAfiATtACTTTC 
GGttTTATT^CCATCITAAGAATATTAGCTCTTCrcATCCATCAAU 
CCTTTACTTACCTCATCTTTAATnTl H C t TT C ITTIinTT C TTTnT CA CACAfiACT 
IT.CC] 

TCTC^rrCAACC&ATTCTCCTCCCTCAC^ 

CCCACCACACCAACTAATTTTTCTATTTTtACTA£A£ACCCGCT 

OCCTittrCTOCAACTCCTUIXTCCTUTrX^ 

TTACAfiGCtTCACCCACCA C TtCCEG C nr t lTTAATfTTIf llAACCA l^ TTir ni tAT 

54so wTTCTariTarrcAGarraxAAaAECTt«^ 

TAA TTTTTCT ATTTTCACT ACACACEGGCTTTCACCA T ATTGCCCACCCT ACTCTCCAAC 
TtSTttUSlCUUTCCACCttfClCACC^^ 

CACCA C ICC QXtr riCTTTAAIITIIITlAACCAI t linT C IATTTTTCAAACTArAC 

atCttcc at i i c i n r t 1 1 aaattta i n c itt r c ncr r ni aatttcatttcasacta 
[T,C]

TT ATTttATTCAT ACTCTTTTAGACTCCACA TTCCCTnTUATrtTCACT AACTTTTTTT 
HHtlWHl I lU ASABGTTTCTATCACAATTTtCCACATtA&AEATCAOBCACATCTCA 
AACTCTCTAA T AH ACtAACfXTCCCCATTT ATCAfcATCAEEA i XX. 1 rT TCC TCATTtAC

CATGLKGbUATrtACTATCTAMECTCAAAAGCTUTACTCTTn 

CATTTTATTtCTiXlTMTAirTACATAnTATGWCTAIITCTUTArrnWTACCTC 

5450 CAITCroCTGCCTUCCCTCCLUCCAOCTCBCACTAUGCCACATJBCCACUUCCMC 
TAArrrTTCTArrTTCACTACACACS Ci C C T T T C ACCATArTCCCCACCCTACTCTTXAAC 
TTXTUCCTOTUTTXACCCaXTCAA^^ 

CACCirrCCCCKTTTCTTT AATTTTTTn AACUITCTTTTTCT A nTTTCAAACTAT AC 
ATCTTCCAI MCI 1 1 IU IAAATTTAI I III 1 1 IU ILI 1 1 1 lAATTTtATTTCACACTA 
[T,C]

TTATTECATTCATA6T61 1 1 rACACTCCACAn CCC T C TT t ACTCTtACTAA li rrnTTT 
1 1 1 It l Un i'T Hi AfcA U. TTTC1 ATCACAArTTTCCACATCACACA TGACCACATCTCA 
AACrCTCTAATATrACCAACCCrCCCCATTTATCAMTCAC^TtCrnTSCT ^ ITXAC 
CATGCASGCAAATCI ACTATCTAACECTCAAAAGCTCAT A £ IL1TI f ACATACCCACTAA 
CATTTTATTSCTA£ArAATAACIAOTATTTAreGACTAttTCTOTAnTTCATACGTt 

6995 mTTSCTA^TAATAACTAUTAmATCEACTA gl GnJ TATTTTCATACCTGCATA 
CAATXTGCACTWrtAAArcAGCETtTTTAGSCTArrtATtArrTCT 
TAm C t t I T r a AACATTTCAA fcl CI C I rC AA fiC 1 1 1 1 CA CAAAT ATTCAAT ACATTAT 
TCTTAAJCACTCCT A TTLIACACTQ&AACTT ATTCCTTC7 ATtTAAAGACACT AACATTTT 
AACTA T ACJTAT AACLTT ACACAACUAT AAAtTtTCT A I ACGCAAAATTCCC7 ACAACAT 
[C,A]

A£AATTTCATTCCTTACTCTT ACT AA T A&4CCTTTTCAAACA TCCCAACGAT ATTCCTCC 
CTrOWGnTTtAACATCCACtltTtTCCnATATTCCn^ 
A4CAGGCTTCC0CTCACCA1Tt>GACTAAAATAfiCACCTCTACT ACTCTCT ATCTCCAAC 

CCTAnAnAmrcTTeccccnATtiACTaCTwacTATAatTATA C r c rTn u :! 

It»rTttTnAITATTXA£CACrAACTACAATATAAAAITTCTWl&AGCTACSA rC r TTtT 
0341 ACTCAfAAfiEnACA£UGUTAAAGT6TCTATAfiQCAAUrrTXCTACAAGATGACAAT 



mATTmTAfTmAETUTAX^ISCTtmAAAUTCCCAAGUTAI ITXTTXCTTGB 
MXTTTMAfATttACETCTtTBCTT AT A rrcCTCTCCCTttAAATT ATTttT AAAA&AG 
fiCT I E P D C I l^ CCArreAfcirrAAAATACCAfXrrrACrACTCTCTATXTTXAACCCTAT 
TAninATCnCBCOCTTATtACTCTCTCAXJCrATACTCTArACI C rril tC TI t l lC 

ecu 

rn ATWTTXACCArrAACTACMTATAAAArCrcTCACACGTACCAi L 1 I ILI I ILLU 
CT AT AAACCTACTCCATCCT ACAL 1 1 CCTBC1 fcCA TAATACCTCCTCAATAAATCCTTTC 

CATA&AATAAXCCAfinWrTTUCAnTATTCAATACCCAACATATOC&^CCTAGUTC 
TACATATCCAA L I UUIW I C f C T A I t l t l C I C TC C ATCTCCATCTCTACTTO&ArCTACT 

B4TS AAAgATCTTATCrCACTTCCACAAAACTtTCAS X l CC I T I U 1 1 HI I ' U tfCTTATCA 

TTTTA Jl l 1 1 1 1 1 C ACrCCACTTTATTfASEAT^TAAOreEA£AA£CTreACCAAATT A 

raMMm cectccrBCcnaa i rtrem laBcen nciBBC M 1 1 ntren 

rCTATCACTXA£AnATCCAAAGACATrTrreAGCA£AACA£CTAA&AA 

tc.n 

TTTCGCACCTA H CC T fX I AGAAETCAAATTXA TAAAACCCAT ASCCAAfiATTCCAAACC 
AAA^ ITKIII6GG6CtT ITAA£A£ACACA£CA6CAACrATCES&ACCT&UlAfiCTTTC 
CTACCAATACTCAACGCCATTCTXA T ATTXTTXSCA t TC Ct I T E TC T T t U CA fiCT A TCC 
ArcGCCACtTTtAACrCCCTATAACnAAACCCTAGCTCCCAnACCA/WCTTtt 
UOXTTCCCTVQBCCXCKTIXCrnikXUlCTTOCttt 

10045 TTTCCTTCArrCTCCAOTCCCAACTtCCACTACCTCCACAMn 

TGCT ATTT A TGTCCAAATCACAGGT A TAAlXCACTCTT^TTtl\AACrCCCCTTTCCAT AC 
T MJUXA TUXUAiUAACTtUATCTtU rrtAAA.^UAAjkt*£TCCAA£CT A6ACC 

t a t actgaacc n a id acgegaaaga^tcaag&gcacc \ aagc t caa c acacca£Ca 

CTTCCCAfiAAACCTTCTIXATOCTTTCr^^ 
[C,A]



T AfATtUTCTCTCCCCTCTCTC TTTAtLU^ACA^XCTTCGCXAGGCACACTCACTCA T 
6CATCT AATCCCACCAC r I T £&EAGGCCAACCTG6GAG£ATtA£TTCAffiTCA^&ATTTC 
AA&AIXAGCTGGCCtAACATGCTGAMTtCCATTrrCr ACT AAAAA T ACAAAAA TT A&CCA 
GGUTGGTA£CATCTAfiECCTCTACTTXCArTA£n^GGCAGGCTCA&ACATGAGAATCGC 
TTGUCCUGGMGIGG 

10045 TOTXTTGArnriTiAWTrXCMETTXfJ^ 

TGCTAT£TArcTttAAAT£A£A£CTATAACCCArrCTtAn 

TAUl^TGCCMACAAACTCAAATCTCAA TTtAAAACCACAAAtAETGCAACCT A&MX 
TATACTCAACCnATCTAfilXbVAJIGATTCAACCrX^ 

CTTCCCACAAACCnCl fCATC d! 1 ILIlllXtACAAACTCTTArrCTCAACGCACCAS 
[C,A]

TACATCAAI C I C TC CC C T C T C I C rrTAAAACTACACCrfTCCCCAfiECACACTCACTCAT 
CCATCTAATCCXAGCMXnVKAjCCCCAAGCTGGCAGCA^ 

CGCATGCTAGCATUT AGCCCTCT 

11M4 GCrAACTAAAC0G£&Ll tf reCTCTCTCCATTCC6AAATCCTCC^ 

rACCTATCI 6 1 fTI t l' CfifcC CATCAAAATAAAAAATCAfiTTTCTAAAAATTTAACCAATC 
TACACCT ATTT ATTCAACAATAfiGTCTCTCT AAAAAATTTCTTA I L M C 1 1 ILAGTCATA 
ATAnMTAAAAACATCTGCTCCTCrCTtTT ACA T AT A TTTTCA£A TTTTA TGGCACCAA 
ACtAACTA^XAAAreCTGATACTTA^TACTAACTCCTCTACA ICr&rTTC ATSGACGCC 
[C,A]

MTCTtTACAAAarrAaXCAAACTCTCAe&AMCT ^tt 
ACTTTT^AAAAACAAACAntMTA&AeeCTTrCAAACAAAAACtAT 

14070 QBKt acnnttCBBBat/ ttiax ^ 

TtACTrTTSAATCCAAAC t T Tri TL IAAATCTTCTCACTTTArnTAAAA TCTG EC T ATC 
CTCCTTGACAGCACTGCUGGGTArXIASCVUCrrraUATTEAACTTCG 



6666TAAtOCCCTTCTAATTAr 6Ct TT^ L l W TCAATMn if niAAT66AjlCTCT6 6 T 

CT«TTTtAAA€CAt«TTAT9CTMTMTTtAAAAS^^ 

tt.fi.IJ 

ACCATATATGCALI I T It I CCATGCTOCTTCTXACTCCECTGGGTCTAI mTCCCTTCC 
TC E TUX C T C IGTAABCACATSCCTTATTTACTCArCTSAI C l I I ' U. 1 TC C TCC TTJ C TC 
AGGGTTGTCTTXAnAGATCATAAAAACAEGGCCAjBGCAGUGCCTTCAMTSAAGECAA 
TTTGGTOTSETEGTQCTWTWTCnGGTCn6A£CTCCTCrcCC\B6ATAACTEKA£ 
AA»mCUGCACTtA£EAi^OA£CCTT£AGG^ 

1S535 ACnAXTCCrrrgTTTCAfTCTCATCT ATAAUTeCA<UnAAAAAMA*CCTATCTCAT 
Af^rn c n C TTAttAT^CTCCCrTAATATATATAAACCArrTAfiEAtACTCCCT & GQ 
CTEAA TACATGTT AAATCT AAACT A TAGTT ATCTCAAA T GT C TTTCC TT CCAGGAATTTT 
GCAACACACACCAACAT A TCCACACTT ACACAT ACAT A T ATGCAT ACATCCACAT ALA T A 
TTATAAACA CC A£ACTCA6A<UACCA££TfATAAACAATTTAAECCATAA^ 

[T.a 

AAATACtAttACTTCCCAA C TCI rT Ci a ^ TMnCCAfACAJAfiAAAATCTTAATCTTT 

TTCKrrTCATTGGACTAAACACGAATTXATTTQCCCGAACCTATAU 

AAAAAATCTTTACTTTTTAAATAnATACAAnATUkTCAAAAACCAA^ 

T ACGCAAAA T A TT AMTCTT AAATTT ATTCAAAACTT AAAACtTTTTCAA T ITT TT7 T T T 

rmrmn i t BfiHfiB>cTCrcrATC<CTC4CBCT€B<c3tt< croCTc r B >TCiCM 

17618 OGTAACTCE&UTGI^TCBKAfiA£AAfiAATAAAACOCAn^ACTAAArrTAACTCTAC 
TTTGAATTUTUOAamATIXAAITT&lfiAl^^ 
CCTA6AGGAGCGTTACTAAAGACTAAACSAA£WTTTtACAAWTTTCA 
reWTAXAT^TmAfiECCATareAAMlWreCTCACATS^ 
TUTUIAAfiCTTXreE&AAATCTGGIMCTTTCMG^ 
[C,T,A]

GG 6CC CI IT S IQ CMG CC C P B C T TTTCTTATCTAAC C T f BC I TCT C CT T I AnSCTTT CCC 
CAtAA TATGGTTTAT ACCACAT ATTTCTTGAACTCAA TT AAAATTT AAACCCCT ATTT AA 
AGnCTUmTTCOXTC^IUmTTnCCTTCUTCTCCAAAUmiTAAACTC
CCATTnAmAAMUrntTAITCTjtf.Tr T t l^fiGATEAAACTCTASCACCTTCTa 

^tattut^t AarTrn^ACTG^XAa rrcnti ruj^uarLAairTTttU 

18SZ0 ATTTATttAIAA Al I UC I U CAT I tt l 1 1 It l A ATXAAraCTCTCTCAAATCTCTTAlT. 

TrrrrArrrtAJXTTGHrrcrwitCATT^ 

TTACAACnAAACMT<CCCTTXAACTCWCCtXCt t rT C T t AC<C<fiCTCAAItATTAC 
CTGCAnATTTAACCCTCATlTTACACATCTCCCACCCE^ 

CAGCA TTGArTCT ACACTTCACACAT AATA TTCCATEA T AT A T ACT AT ACTT AACTTT AC 

U.-.C] 

AAATATECArTCAeCACATTTTAAATACTCACA l 1 1 1 1 1 fT ATCACTACAATTTATTCTC 
CCCCCTVrCTTXGCTUCCTAATGCTCTAAT ACASGA£A£A&GA£ACA£A£CTCCAAA TT 
CCACTCTACCATAATCAgSGfAATCATACAfiATA rCT tf TC tt t AACACAAA6ACAIACA 
MiACjW£TACtT.^CCT^GCATCG£AGCTtAAC&^&ilI I LCI I U CATTTAC6CTMCT 
6CAG^TAACTACM^ACCtAfiCreQUA£TCTtATCrCTA 1 1 1 ICiCIACACTTTAAC 

13S2S TTXAT AMTTCTCreTCA TTKTTTTCT AA TCAA TGGTETCTtAAA TCTTTTATTrCTTT 
ATTTt^UXTTSGCTTTUiaunatumCG^^ 
ACrTAAACAATACBETCCAAfiTCEA fcCHt l C I I C I M fiAfiAflCTCAATGATTAfiCrSCA 
n ATTTAACGCTT^nTTA£Af>nrtTXCA CCCtC f l C 1 C ACCMTTTT ATTTXTtASCA 
TTCATTH AGACTTEACACATAA T ATTCCA TCAT ATAT ACT A T ACTT AACTTT ACCAAAT 
[-,T,A]

TCCACTCACCACArrn AAATACTCACA C T t tTT T t ATCACT ACAA TTT ATTCTCGCCCC 
TC rCTTC K T 6 ACtTAArCGTCTAATA£AfiCACACACCACACACACCTtCAAATTCCACT 
CTACCAT AArCABCCCAATCATACACATA T C TCCTCCCr AACACAAACACATAGAAGACA 
GCTACCTACCCrCGtUTGGCAGCTfJUGGAGA CTT CCT ^ 
ATAACTAG&AGTTAGCtAfiGTGGAAAnVTT^TCT^ 

18S2S tXCATAAATTCTtTCrCATTgl 1 1 [tT AATCAATCETCTCTCMATCTCTT Al Hi ll I 



ATTrtAC C Tl U XI C IUATCCAnttAAATWC&tCTT^lCCClUiXIUXACTTAtA 
AC?TAAACAATACCGTCCAACT1XA(XtXCtXTTCt^ 

mmAAcscTCATmAataTCTtccA tau: ! mo mumTAmciacw 

1T^TT1TACACT1tA£A£ATAArATTCGlT&ATAM 

[-,T,A]

TCCArrCACGACATTTTAMTArrcAGArTTTTTTT AIIACT ACAATTT A TTCTCCCCCC 

m mc g T tA caAATctmMTA fm iiu r <ccA ^ 

CTACUTUTUCCCCMTUTACAMTATCTCCTOCCTAACACAAACAa 

GCTACCTACCCTCCCATGGCACCTXAAfiUtCACTTGCTTGA£AT^ 

A F AACT ACLACTT ACCUCCTtEAAArrCTXATTTrr A TCTTKT ACACTTTAACCA TAT 

19189 CTGGTCAAAGGGACA£AAA£ACA£AAATCCT AAG&A£AATTtA£CAGCA£ACCA£ATAAA 
AAACACC^TATTTCATATGCAAAAGTtAAXTCAArrUAACATTTCT 
ACAnATAAMCTATATCASAfiAItTXATTTTATAACtAMrACAACCCtl UU IACCA 
TAAACTAAAUmAATCTATATACCACAAMTACAATOTUaMIUTTmAAm 
ATTTmAArrcACAAAMrTCT^TATAUTmATATATATAIVrArCTCTCTATAT 
[T,C,A]

TATATUTETADUCATWTATTTTUTATAmATACAfrSTObUIUaAMTCTAT 
CAAIttACArenCATTAACTCATACTTATCA I IUI fTCTCCTA ACGACATTTAAMTC 
TACCCTTXTAGCAArnTT^ACTATACAW 
AreCATCTTXTAAACnATGCnTXTCTn^ACT 

CCCTCTAATtXCCCArrTrrTXCACAGCCCCTCCTAA ILI1 

192S9 TTTUTATttAAAACrCAACTCAATTCAAACATTTCTA 

ACTATATCAfiAfiAItTXATTTTATAAflSAAArAfiAA CC CCT rTCCTACCATAAACTAAAC 
ATTT AA TCT A T AT AGCACAAAAT ACAA TCTTCACT AATTlATTTTTMITr ATTTTTT AAC 
TCACAAAAAncreCATATACATGTTATATATATAinAlXTCTCTATATATATATUTG 
TACAAtATUTATTTTUTATATCTATACATTCTCGAATUCTAAATCTATCAATCCA£A 
[C,T]



CrranAATTaT ACTTATCA TTTTTTTCT GET AACGACA TTT AAAATCT ACCCnTTA 
GCMTTTTtAAGTATACAMTTOTACTAACTt^ 
TAAACnATCCCTCCTCTTTUCTQUATTTTCTATCCTTTC^ 
CCC«TTnTXCAD«CCCCTgTAACC tf l L Ii n 

CTTnA£ATTTTXACAreTCA£ATT^reTGCAAl I H.1C1 HCltlGCCIGGtl tATTTC 

1932S TCAMGATnXATTTTATAACEAAATASAA fiCCC ITTDCT ACCATAAACTAAAMTTAA 
TTT ATA T AGCACAAAAT ACAA TGTTGAGT AATCA TTTTT AA TTT A TTTTTT AArTGACAA 
AMrrCTCCATATACATCnATATATATATCTATGTCTCTATATATArATEATCTACAAC 
ATWT ATTTTCAT AT ATCT AT ACAXTTCTaAATCACT AAATCT ATCAATGGACA TCTTCA 
TTAACTtATACTTATCA n TT TTTCTCCT AAGCACATTT AAMTCT ACtCTCTT AGCAAT 
K.T] 

TTGWCrATACAAATTEnACTAA£TTXAATCAXAT 

TATCC C TCCT t rCT S ACTCAAATTTTtTATT X TTT C ACTAACAI C C C T C TAATXCCCCAT 
TCItXCACAfiCCCCTCCTAACCACT C TT C TAjCT C TCT tf lT C T ITCACTTTA ATCT T I T A 
WTTTTXACATCTCACATCATCTQCAAI I r CTtT TTCT CT C DCTBBCTTA TTTCAOTAC 
afAArCTtATTXAAATTa iC »C! CMCIU TAMTMCAACATAtl I CIl l tTTCTAI 

19348 CAAATAGAAGCCCTTTCCT AOCATAAACT AAACA rTTAATCT ATAT AGCACAAAAT ACAA 
TCTTCACT AATCA TTTTT AATTT ATTTTTT AACTGACAAAAArTCTGCAT AT ACATVTT A 
T AT AT AT ATCT ATETCTCT A TAT AT A T ATGATCT ACAACATtATA TTTTGAT A T ATCT A T 
AtACTCTGGAAreACTAMTtTATtAATCWCATVnTJTT 
TTTCTCCT AACEACA TTT AAAA TCT ACC C TtnACCAATTTTCAACT AT AC^ 

T AACTCtAA IXACAT ATTCT ACAATCATTTCCT AAATTf ATtiCC TCCItlCl U CTCAA 
Al I riCIATCCniGAl^AACArCtCTCTAAICCrXCAiIi[CiXACA£CCCCTGCIAAC 
CACTCTTCTACTCrCr GC TrCTTT GA GTTTAAI C r TT I A&ATTTtCACA TCTCACATCAT 
CTG&AAintTnrTCTUrUIIUiTTATTTCACTTAfiCATAATCT^ 
m re n CT CATAMrMCAACATAr Il t r C mi CT AimTMITOTACTCaTTCT 



20845 TGTTACTGCAACC IMC r AGATtAGTTGACAATAAATtl&TGGGTGI ATT TCTGGACTCT 
nATtXTCnnAnACTTTATAreiCTCTinTITA£AAfiCTCTAT<XI<iTTTTUiTCA 
CT ACAGCTCTCT A€TCAA TTTt^GATtAGG T AG T ATGA TGCACTTXAGnTTECTtTTTT 
rECTCAAAA ITU, T T l UA T ATTTCA i. n T T ITT ATTtCATACSAATTTTAGGGCTTTTT 

ir rm i n^ nAaCT^TAATcecAnBGAAnnsATO^TTOTTWTCT 

[-.n 

TGGGT ACT A TCGA T ATTTT AACAGTA TT AA TGCTTCCMTTMTCAACACAGGCT ATTTT 
CCAA I I TCrCTT TT CI LL AATTT L T na CCA L T ^ ni miC rTAATTTAATl tn 1 1 A 
ITTtOTACCCTTTCOCTAACA CC T BBTC TTT CC T TATCACTAA i, | TCI I IACTCCT6AT 
TTGTGAGATTrTCATCXAiXCATCAJXTAAG^ 
TCTATTXCTLAXCTCCCTTXCACCArTTTXC^ 

20B4S TCTT ACT GGAACCTTTC T A&ATtAGTTCACAA TAAATCTCTfiGCTCT ATTTCTGGACTCT 
TT AT C CTCT f I IAnACTTTAT A T 6 T X T C f TTTT TT A CAACCTCTATB C f tt TTT T O CTCA 
CTACA£CTCTGTA£TCAATTTCA^TXAGCTACTATCAT^ 
TCCTCAAAA f I C C f I I t C CTA TTTttCTTTfTTIAITCCATAJBIAITTTil flBBCf T TI f 
TTTTTTTTTXAnACTCTCAATAATGCCATTO^ 
[T,C]

TGEGT ACT A rCGATATTTT AACAGTA TT AATT£TTTXAA1TMTGAACACAG£GT ATTTT 
CCAAI t rCTOTTTCTTCAATTTCT ITCACCAC r & TTT T nTCTTAATTTAAI T t I T T I A 
TTTOaTA Ctt ni Uj: tAACA a i Ct r t HT « rTAT6ACTAACTTCm 
TTVTt^ TTTTtATCCATCCA TCAJXT AAGCAGT AT ACACTCTACCCAA TTTCTACTCT 
TCTATTXCTCACCTCCCTTXCAJXArro 

22234 ASAAACTTTn ACTTrAATTAACTCCCACCT ATTT A 1 I 1 1 1 IU.J IU IU HU U I H, 
B6CTTCT I TTCnTTBBCnBCT TT TEC AT CTXCT n IC 0ETTCTTC6TCATCAACTCTT 
raCTAAfiCCAATATCrACAA BCC TT I TTC 1 6 ATCTTCTACAAITTTTATCCTTCACCTC 
nAamAAGTtrrrWTCaTCTTUCTTGATTTTTCT 



cacmaittmiareittcnaounAittctfiACAATTTCTTWATAfice 
(T.a 

TAATirTf AJMCCTTTAIATAnTiSCTCTTtCTATTTTCCCT ACATATTTATTT ACAAC 
TATCATATCCTCCTCATGCATTCACCOCTTTtTCATTATATAAIUbltl ICTTGICTCTT 
TTTACA L I I I ntll ll A A^fitlTAATTTCTtTGATAAAACTTCAGCTAlXI 11U.1C1L 
TTTlWTITrrAnTGCAIBOUIATTTTTrrcCAACCCTTttCAITCACTCTAICTCTC 
TTCTTAAACATCAAAICAGATGCTCTAfiCDCCATAI U. 1 1 UX1 L 1 1 1 It 1 1 ATTCATTC 

22234 AfiAAACTTTTTACTTTAATTAACTtCCACCTATTTAl II 1 1 IU1 IU KTICTTfTTTG 

cttMUMULiMiaCMaiiiicarCTWinrettncTTGCTaTCAACTcn 

TBCCTAACCCAATATCTACAAESgl 1 1 1 1 C i UTOTCT AUATTTTTATCCTTtACCTC 
TT A£A TTTAACTCCTTCA TCCA TCTTWACTTWrTTTTCT ATAACCTGA£A£ATKAGGAT 
CCACTTTttTCCTTCTAC A IUUXI H XCAATTATPCCACTACAATTTCTTCAATAfiCC 

TAATATTTAAAflCTTTAtATArrrAfiCTCrTCCTAITTTOCnACAIAlTTATTTACAAC 
TATCAIATCCTCCT&ArSEATTCAtXtC 1 1 rCTCATTATATAATCCTCTTCl rCTCTCT I 
mAC Mi nTT UIC nAAAgXTAAmCTCreATAAAACnCACCTAIXniU,lClt 
TTTTKTTTCIATTTGCATECA^IAItn riTttAACCrntECATTCACTCTATCrtTG 
TTCTTAAAfiATGAAATSAUTGCTCTABCGGCATArGCI ICGCICTTCTTT1 ATTCATTC 

722*1 TTTAATTAACTCCCACCTATTTA 1U rTTCCTTCTTCTTCTTTTI rCCCCTTCTTTTCTT 
I [BBCTTttTTTTCCATtTCCTTTTCOCI 1C I la T CA TCAA t lCC HECCTAAGCCAAT 
ATCTACAA UX TT f T rCT ^ TCnCTACAATTTTTATCCTTCAfiCTC^ACATTTAACTC 
CTmTCCATCTTWCTCATTTTTCTATAAOTCAfiACATWeM 
TCTACATCTOC1 1 B CCMTTATOCCACTACAATTTCITCAATACCCTTAATATTTAAAg 

(cn 

TTT AT AT ATTT ACGTETTtCT A TTTTGGCT AEA TA TTT A TTTACAACT A TCAT A TCCTCC 
TWttATTCAC UC I I [ Clt ATTATATAAI U,llMU I UUU I 1 1 1 ACACTTTTTG 
TCTtAAAflCCTAATTTCTCTtATAAAAETTCACCTA U. il 1U. 1 CI t l 1 11U»1 1 1 C1 AT 
ttcatbput ai urn iccA AarTTC6CAntACrnAria;&ncTTAAA6Are*

AArCA&tTgTCTACEeSCAtArai IUJUC I It TITIATTCATTCATTCACCCACCCT 

22334 STTCTTCETCATtAACTCT 1 1 UX 1 AACtXAATATCTACAAUA, till rCTCATCTTCTA 
CAAmnATBCTTCAfiCTmACAmAACTCCTTWTCCATCTTMSTTCA 
ATAACCTCAGACATtACSATCCAGTTrCATSCTTCTA^ 

CTACAATTTCTTCAATAOCCTT AAT A TTT AAACCTTT AT AT ATTT ACL It T TCC I ATTTT 
CGCT A£AT A TTT A TTT ACAACT ATCAT ATCCTCCTCA TGLATTCACCCCTTTCTCATTAT 
U« 

TMr U. r inl l lLaiUUll A£A ^ llt! Ull llAAAgXTAArrrCTCTUTAAAA 

encACCTAcen hctctci 1 1 lun i C TAmajtrecMTATTTrmttAACCCT 

TOECATTCACTCTAItlt IfcTt CnAAAEArSAAATtAfcATCCTCTAGCGECATATCCTT 
UMU HI 1 1 IA nT^nCATTCABCCACCCTTTTMTTA6AfiAAriTAATTCAnTCT 
ATTCAAGCTMTTATTtACACACAAflGAnTACTACTCCCATTTTCTTAAT ICTTTTCn 

23033 A 1H1 1 1 VI IbCltl ACTATA Ui l 1 1 l ltf l 1 1UU.1 1 ACCAmfiGCTT ACATAAACC 
ATACTTATAAAACECTATTTTAAACreATAACACCrTAACTTTCAACACT^ 

ATACAcrTTiACTtTACCAACTCCCcrtiArrrTATvrrrr^ 

CnTTGGACATCTSTCCCmATTSTWATTXOTAACAM 

TTTTAATAC1 1 1 TO 1 1 1 A ACTTTATACT ACACATACAATTAATTAACATACCACCAC 

[T,-]

[Figure sheets: genomic sequence listing giving SNP positions with 5' and 3' flanking sequences and variant alleles.]

ISOLATED HUMAN DRUG-METABOLIZING PROTEINS, NUCLEIC ACID
MOLECULES ENCODING HUMAN DRUG-METABOLIZING PROTEINS,
AND USES THEREOF

RELATED APPLICATIONS

The present application claims priority to provisional application U.S. Serial No.
60/241,745, filed October 20, 2000 (Atty. Docket CL000897-PROV), and is a continuation-in-part
of application U.S. Serial No. 09/739,456, filed December 19, 2000 (Atty. Docket
CL000897), a continuation-in-part of 09/818,647, filed March 28, 2001 (Atty. Docket CL00897-CIP),
and U.S. Serial No. 09/852,067, filed May 10, 2001 (Atty. Docket CL000897-CIP-B).

FIELD OF THE INVENTION 

The present invention is in the field of drug-metabolizing proteins that are related to the
omega-hydroxylase cytochrome P450 drug-metabolizing enzyme subfamily, recombinant DNA
molecules and protein production. The present invention specifically provides novel
drug-metabolizing peptides and proteins and nucleic acid molecules encoding such protein molecules,
for use in the development of human therapeutics and human therapeutic development.

BACKGROUND OF THE INVENTION 

Drug-Metabolizing Proteins

Induction of drug-metabolizing enzymes ("DMEs") is a common biological response to
xenobiotics, the mechanisms and consequences of which are important in academic, industrial,
and regulatory areas of pharmacology and toxicology.

For most drugs, drug-metabolizing enzymes determine how long and how much of a drug
remains in the body. Thus, developers of drugs recognize the importance of characterizing a drug
candidate's interaction with these enzymes. For example, polymorphisms of the drug-metabolizing
enzyme CYP2D6, a member of the cytochrome p450 ("CYP") superfamily, yield
phenotypes of slow or ultra-rapid metabolizers of a wide spectrum of drugs including
antidepressants, antipsychotics, beta-blockers, and antiarrhythmics. Such abnormal rates of drug
metabolism can lead to drug ineffectiveness or to systemic accumulation and toxicity.

For pharmaceutical scientists developing a candidate drug, it is important to know as early
as possible in the design phase which enzymes metabolize the drug candidate and the speed with
which they do it. Historically, the enzymes on a drug's metabolic pathway were determined
through metabolism studies in animals, but this approach has now been largely supplanted by the
use of human tissues or cloned drug-metabolizing enzymes to provide insights into the specific
role of individual forms of these enzymes. Using these tools, the qualitative and quantitative fate
of a drug candidate can be predicted prior to its first administration to humans. As a
consequence, the selection and optimization of desirable characteristics of metabolism are
possible early in the development process, thus avoiding unanticipated toxicity problems and
associated costs subsequent to the drug's clinical investigation. Moreover, the effect of one drug
on another's disposition can be inferred.

Known drug-metabolizing enzymes include the cytochrome p450 ("CYP") superfamily,
N-acetyl transferases ("NAT"), UDP-glucuronosyl transferases ("UGT"), methyl transferases,
alcohol dehydrogenase ("ADH"), aldehyde dehydrogenase ("ALDH"), dihydropyrimidine
dehydrogenase ("DPD"), NADPH:quinone oxidoreductase ("NQO" or "DT diaphorase"),
catechol O-methyltransferase ("COMT"), glutathione S-transferase ("GST"), histamine
methyltransferase ("HMT"), sulfotransferases ("ST"), thiopurine methyltransferase ("TPMT"),
and epoxide hydroxylase. Drug-metabolizing enzymes are generally classified into two phases
according to their metabolic function. Phase I enzymes catalyze modification of functional
groups, and phase II enzymes catalyze conjugation with endogenous substituents. These
classifications should not be construed as exclusive or exhaustive, as other mechanisms of drug
metabolism have been discovered. For example, the use of active transport mechanisms has been
characterized as part of the process of detoxification.

Phase I reactions include catabolic processes such as deamination of aminases, hydrolysis
of esters and amides, conjugation reactions with, for example, glycine or sulfate, oxidation by
the cytochrome p450 oxidation/reduction enzyme system and degradation in the fatty acid
pathway. Hydrolysis reactions occur mainly in the liver and plasma by a variety of non-specific
hydrolases and esterases. Both deaminases and amidases, also localized in the liver and serum,
carry out a large part of the catabolic process. Reduction reactions occur mainly intracellularly in
the endoplasmic reticulum.

Phase II enzymes detoxify toxic substances by catalyzing their conjugation with
water-soluble substances, thus increasing toxins' solubility in water and increasing their rate of
excretion. Additionally, conjugation reduces the toxins' biological reactivity. Examples of
phase II enzymes include glutathione S-transferases and UDP-glucuronosyl transferases, which
catalyze conjugation to glutathione and glucuronic acid, respectively. Transferases perform
conjugation reactions mainly in the kidneys and liver.

The liver is the primary site of elimination of most drugs, including psychoactive drugs,
and contains a plurality of both phase I and phase II enzymes that oxidize or conjugate drugs,
respectively.

Physicians currently prescribe drugs and their dosages based on a population average and
fail to take genetic variability into account. The variability between individuals in drug
metabolism is usually due to both genetic and environmental factors, in particular, how the
drug-metabolizing enzymes are controlled. With certain enzymes, the genetic component
predominates and variability is associated with variants of the normal, wild-type enzyme.

Most drug-metabolizing enzymes exhibit clinically relevant genetic polymorphisms.
Essentially all of the major human enzymes responsible for modification of functional groups or
conjugation with endogenous substituents exhibit common polymorphisms at the genomic level.
For example, polymorphisms expressing a non-functioning variant enzyme result in a sub-group
of patients in the population who are more prone to the concentration-dependent effects of a
drug. This sub-group of patients may show toxic side effects to a dose of drug that is otherwise
without side effects in the general population. Recent development in genotyping allows
identification of affected individuals. As a result, their atypical metabolism and likely response
to a drug metabolized by the affected enzyme can be understood and predicted, thus permitting
the physician to adjust the dose of drug they receive to achieve improved therapy.

A similar approach is also becoming important in identifying risk factors associated with 
the development of various cancers. This is because the enzymes involved in drug metabolism 
are also responsible for the activation and detoxification of chemical carcinogens. Specifically. 
Abnormal activity of drug-metabolizing enzymes has been implicated in a range of
human diseases, including cancer, Parkinson's disease, myotonic dystrophy, and developmental
defects.



Cytochrome p450

An example of a phase I drug-metabolizing enzyme is the cytochrome p450 ("CYP")
superfamily, the members of which comprise the major drug-metabolizing enzymes expressed in
the liver. The CYP superfamily comprises heme proteins which catalyze the oxidation and
dehydrogenation of a number of endogenous and exogenous lipophilic compounds. The CYP
superfamily has immense diversity in its functions, with hundreds of isoforms in many species
catalyzing many types of chemical reactions. The CYP superfamily comprises at least 30 related
enzymes, which are divided into different families according to their amino acid homology.
Examples of CYP families include CYP families 1, 2, 3 and 4, which comprise endoplasmic
reticulum proteins responsible for the metabolism of drugs and other xenobiotics.
Approximately 10-15 individual gene products within these four families metabolize thousands
of structurally diverse compounds. It is estimated that collectively the enzymes in the CYP
superfamily participate in the metabolism of greater than 80% of all available drugs used in
humans. For example, the CYP 1A subfamily comprises CYP 1A2, which metabolizes several
widely used drugs, including acetaminophen, amitriptyline, caffeine, clozapine, haloperidol,
imipramine, olanzapine, ondansetron, phenacetin, propafenone, propranolol, tacrine,
theophylline, and verapamil. In addition, CYP enzymes play additional roles in the metabolism of
some endogenous substrates including prostaglandins and steroids.

Some CYP enzymes exist in a polymorphic form, meaning that a small percentage of the
population possesses mutant genes that alter the activity of the enzyme, usually by diminishing
or abolishing activity. For example, a genetic polymorphism has been well characterized with the
CYP 2C19 and CYP 2D6 genes. Substrates of CYP 2C19 include clomipramine, diazepam,
imipramine, mephenytoin, moclobemide, omeprazole, phenytoin, propranolol, and tolbutamide.
Substrates of CYP 2D6 include alprenolol, amitriptyline, chlorpheniramine, clomipramine,
codeine, desipramine, dextromethorphan, encainide, fluoxetine, haloperidol, imipramine,
indoramin, metoprolol, nortriptyline, ondansetron, oxycodone, paroxetine, propranolol, and
propafenone. Polymorphic variants of these genes metabolize these substrates at different rates,
which can affect a patient's effective therapeutic dosage.

While the substrate specificity of CYPs must be very broad to accommodate the
metabolism of all of these compounds, each individual CYP gene product has a narrower
substrate specificity defined by its binding and catalytic sites. Drug metabolism can thereby be
regulated by changes in the amount or activity of specific CYP gene products. Methods of CYP
regulation include genetic differences in the expression of CYP gene products (i.e., genetic
polymorphisms), inhibition of CYP metabolism by other xenobiotics that also bind to the CYP,
and induction of certain CYPs by the drug itself or other xenobiotics. Inhibition and induction of
CYPs are among the most common mechanisms of adverse drug interactions. For example, the
CYP3A subfamily is involved in clinically significant drug interactions involving nonsedating
antihistamines and cisapride that may result in cardiac dysrhythmias. In another example,
CYP3A4 and CYP1A2 enzymes are involved in drug interactions involving theophylline. In yet
another example, CYP2D6 is responsible for the metabolism of many psychotherapeutic agents.
Additionally, CYP enzymes metabolize the protease inhibitors used to treat patients infected
with the human immunodeficiency virus. By understanding the unique functions and
characteristics of these enzymes, physicians may better anticipate and manage drug interactions
and may predict or explain an individual's response to a particular therapeutic regimen.

Examples of reactions catalyzed by the CYP superfamily include peroxidative reactions
utilizing peroxides as oxygen donors in hydroxylation reactions, as substrates for reductive
beta-scission, and as peroxyhemiacetal intermediates in the cleavage of aldehydes to formate and
alkenes. Lipid hydroperoxides undergo reductive beta-cleavage to give hydrocarbons and
aldehydic acids. One of these products, trans-4-hydroxynonenal, inactivates CYP, particularly
alcohol-inducible 2E1, in what may be a negative regulatory process. Although a CYP
iron-oxene species is believed to be the oxygen donor in most hydroxylation reactions, an iron-peroxy
species is apparently involved in the deformylation of many aldehydes with desaturation of the
remaining structure, as in aromatization reactions.

Examples of drugs with oxidative metabolism associated with CYP enzymes include
acetaminophen, alfentanil, alprazolam, alprenolol, amiodarone, amitriptyline, astemizole,
buspirone, caffeine, carbamazepine, chlorpheniramine, cisapride, clomipramine,
clozapine, codeine, colchicine, cortisol, cyclophosphamide, cyclosporine, dapsone, desipramine,
dextromethorphan, diazepam, diclofenac, diltiazem, encainide, erythromycin, estradiol,
felodipine, fluoxetine, fluvastatin, haloperidol, flurbiprofen, imipramine, indinavir, indomethacin,
indoramin, irbesartan, lidocaine, losartan, macrolide antibiotics, mephenytoin, methadone,
metoprolol, mexiletine, midazolam, moclobemide, naproxen, nefazodone, nicardipine,
nifedipine, nitrendipine, nortriptyline, olanzapine, omeprazole, ondansetron, oxycodone,
paclitaxel, paroxetine, phenacetin, phenytoin, piroxicam, progesterone, propafenone,
propranolol, quinidine, ritonavir, saquinavir, sertraline, sildenafil, S-warfarin, tacrine, tamoxifen,
tenoxicam, terfenadine, testosterone, theophylline, timolol, tolbutamide, triazolam, verapamil,
and vinblastine.

Abnormal activity of phase I enzymes has been implicated in a range of human diseases.
For example, enhanced CYP2D6 activity has been related to malignancies of the bladder, liver,
pharynx, stomach and lungs, whereas decreased CYP2D6 activity has been linked to an increased
risk of Parkinson's disease. Other syndromes and developmental defects associated with
deficiencies in the CYP superfamily include cerebrotendinous xanthomatosis, adrenal
hyperplasia, gynecomastia, and myotonic dystrophy.

Omega-Hydroxylase Cytochrome P450

The novel human protein, and encoding gene, provided by the present invention is related to
the omega-hydroxylase cytochrome P450 family, which includes, for example, cytochrome P450
4A4 (CYP4A4), cytochrome P-450p-2, prostaglandin omega-hydroxylase, and laurate
omega-hydroxylase. Omega-hydroxylase cytochrome P450 proteins catalyze omega- (including omega-1)
hydroxylation of prostaglandin A and fatty acids such as caprate, laurate, myristate, and palmitate
(Yoshimura et al., J Biochem (Tokyo) 1990 Oct;108(4):544-8). CYP4A4 is elevated during
pregnancy (Palmer et al., Arch Biochem Biophys 1993 Feb 1;300(2):670-6).

Matsubara et al., J Biol Chem 1987 Sep 25;262(27):13366-71; Yamamoto et al., (1984) J.
Biochem. (Tokyo) 96, 593-603; Yokotani et al., Eur J Biochem 1991 Mar 28;196(3):531-6; and
Johnson et al., Biochemistry 1990 Jan 30;29(4):873-9.

Cytochromes, such as the protein provided by the present invention, have many utilities,
in addition to those described above. Cytochromes not only metabolize normal physiological
substrates but also neutralize environmental toxins. In addition to oxidizing steroids, fatty acids,
and foreign compounds in liver cells, cytochromes can also be induced by toxic chemicals,
pesticides, and carcinogens.

Immunological and PCR-based assays for cytochromes may be used to determine toxicity
and turnover rate of experimental medicines. Selective cytotoxic drugs can be designed that
interact with a particular cytochrome and trigger cell death, thereby providing potential new
treatments for cancer.
Cytochromes can generate free radicals that cause myocardial cell injury and induce
endothelial cell damage. In experimental models, alpha-tocopherol and other anti-oxidants
suppress generation of free radicals. Glutathione and glutathione peroxidase contribute to natural
protection against free radical-induced cell damage. Characterization of all cytochromes will
assist development of more efficient anti-oxidants. The sequence provided by the present
invention can be used to design specific chemopreventive drugs.

The cytochrome provided herein, as well as other human cytochromes, can be used in a
high-throughput drug screen to discover anti-parasitic drugs that inhibit non-human oxygenases
but exhibit no toxicity for the human enzymes.

For a further review of the CYP superfamily, see Igarashi et al., Arch Biochem Biophys
1997 Mar 1;339(1):85-91; Med Lett Drugs Ther 2000 Apr 17;42(1076):35-6 (no authors listed);
Fowler et al., Biochemistry 2000 Apr 18;39(15):4406-14; Lamb et al., Chem Biol Interact 2000
Mar 15;125(3):165-75; Chiba et al., Xenobiotica 2000 Feb;30(2):117-29; and Meehan et al., Am
J Hum Genet 1988 Jan;42(1):26-37.

The CYP superfamily is a major target for drug action and development. Accordingly, it is
valuable to the field of pharmaceutical development to identify and characterize previously
unknown members of the CYP superfamily.

UDP-glucuronosyltransferases

Potential drug interactions involving phase II metabolism are increasingly being
recognized. An important group of phase II enzymes involved in drug metabolism are the
glucuronosyltransferases, especially the UDP-glucuronosyltransferase ("UGT") superfamily.
Members of the UGT superfamily catalyze the enzymatic addition of UDP glucuronic acid as a
sugar donor to fat-soluble chemicals, a process which increases their solubility in water and
increases their rate of excretion. In mammals, glucuronic acid is the main sugar that is used to
prevent the accumulation of waste products of metabolism and fat-soluble chemicals from the
environment to toxic levels in the body. Both inducers and inhibitors of
glucuronosyltransferases are known and have the potential to affect the plasma concentration and
actions of important drugs, including psychotropic drugs.

The UGT superfamily comprises several families of enzymes in several species defined
with a nomenclature similar to that used to define members of the CYP superfamily. In animals,
yeast, plants and bacteria there are at least 110 distinct known members of the UGT superfamily.
As many as 33 families have been defined, with three families identified in humans. Different
UGT families are defined as having <45% amino acid sequence homology; within subfamilies
there is approximately 60% homology. The members of the UGT superfamily are part of a
further superfamily of UDP glycosyl transferases found in animals, plants and bacteria.

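By way of illustration only, the family and subfamily thresholds stated above can be expressed as a short sketch. The Python example below assumes that "homology" is measured as percent identity over a pairwise alignment that has already been computed, and it treats the approximate 60% figure as a hard cutoff. The function names `percent_identity` and `classify_pair`, the default cutoffs as parameters, and the toy sequences in the usage note are hypothetical and form no part of the UGT nomenclature itself.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences ('-' marks a gap).

    Assumption: columns in which both sequences carry a gap are skipped;
    a gap opposite a residue counts as a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == "-" and y == "-":
            continue  # nothing to compare in this column
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * matches / compared


def classify_pair(identity: float,
                  family_cutoff: float = 45.0,
                  subfamily_cutoff: float = 60.0) -> str:
    """Apply the thresholds from the text to one pairwise identity value.

    Assumption: identity below 45% places two enzymes in different families,
    and identity of about 60% or more places them in the same subfamily.
    """
    if identity < family_cutoff:
        return "different families"
    if identity >= subfamily_cutoff:
        return "same subfamily"
    return "same family, different subfamilies"
```

For example, `classify_pair(percent_identity("MKTA-", "MKTAY"))` compares five columns, finds four matches, and so classifies the toy pair at 80% identity as belonging to the same subfamily. Actual family assignment would rest on full-length alignments and the judgment of a nomenclature committee rather than on a single cutoff.
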
The role of phase II enzymes, and of UGT enzymes in particular, is being increasingly
recognized as important in psychopharmacology. UGT enzymes conjugate many important
psychotropic drugs and are an important source of variability in drug response and drug
interactions. For example, the benzodiazepines lorazepam, oxazepam, and temazepam undergo
phase II reactions exclusively before being excreted into the urine.

Phase II enzymes metabolize and detoxify hazardous substances, such as carcinogens.
The expression of genes encoding phase II enzymes is known to be up-regulated by hundreds of
agents. For example, oltipraz is known to up-regulate phase II enzyme expression. Studies have
demonstrated protection from the cancer-causing effects of carcinogens when selected phase II
enzyme inducers are administered prior to the carcinogens. The potential use of phase II enzyme
inducers in humans for prevention of cancers related to exposure to carcinogens has prompted
studies aimed at understanding their molecular effects. Current biochemical and molecular
biological research methodologies can be used to identify and characterize selective phase II
enzyme inducers and their targets. Identification of genes responding to cancer chemopreventive
agents will facilitate studies of their basic mechanism and provide insights about the relationship
between gene regulation, enzyme polymorphism, and carcinogen detoxification.

Examples of drugs with conjugative metabolism associated with UGT enzymes include
amitriptyline, buprenorphine, chlorpromazine, clozapine, codeine, cyproheptadine,
dihydrocodeine, doxepin, imipramine, lamotrigine, lorazepam, morphine, nalorphine, naltrexone,
temazepam, and valproate.

Abnormal activity of phase II enzymes has been implicated in a range of human diseases.
For example, Gilbert syndrome is an autosomal dominant disorder caused by mutation in the
UGT1 gene, and mutations in the UGT1A1 enzyme have been demonstrated to be responsible
for Crigler-Najjar syndrome.

The UGT superfamily is a major target for drug action and development. Accordingly, it is
valuable to the field of pharmaceutical development to identify and characterize previously
unknown members of the UGT superfamily.

Drug-metabolizing enzymes, particularly members of the omega-hydroxylase cytochrome
P450 drug-metabolizing enzyme subfamily, are a major target for drug action and development.
Accordingly, it is valuable to the field of pharmaceutical development to identify and characterize
previously unknown members of this subfamily of drug-metabolizing proteins. The present
invention advances the state of the art by providing previously unidentified human
drug-metabolizing proteins that have homology to members of the omega-hydroxylase cytochrome P450
drug-metabolizing enzyme subfamily.

SUMMARY OF THE INVENTION 

The present invention is based in part on the identification of amino acid sequences of
human drug-metabolizing enzyme peptides and proteins that are related to the
omega-hydroxylase cytochrome P450 drug-metabolizing enzyme subfamily, as well as allelic variants
and other mammalian orthologs thereof. These unique peptide sequences, and nucleic acid
sequences that encode these peptides, can be used as models for the development of human
therapeutic targets, aid in the identification of therapeutic proteins, and serve as targets for the
development of human therapeutic agents that modulate drug-metabolizing enzyme activity in
cells and tissues that express the drug-metabolizing enzyme. Experimental data as provided in
Figure 1 indicates expression in humans in the stomach, brain (including infant), endometrial
tumors, prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and
hepatocellular carcinomas.

DESCRIPTION OF THE FIGURE SHEETS

FIGURE 1 provides the nucleotide sequence of a cDNA molecule that encodes the
drug-metabolizing enzyme protein of the present invention (SEQ ID NO:1). In addition, structure
and functional information is provided, such as ATG start, stop and tissue distribution, where
available, that allows one to readily determine specific uses of inventions based on this
molecular sequence. Experimental data as provided in Figure 1 indicates expression in humans
in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland
tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas.

FIGURE 2 provides the predicted amino acid sequence of the drug-metabolizing enzyme
of the present invention (SEQ ID NO:2). In addition, structure and functional information such
as protein family, function, and modification sites is provided where available, allowing one to
readily determine specific uses of inventions based on this molecular sequence.

FIGURE 3 provides genomic sequences that span the gene encoding the
drug-metabolizing enzyme protein of the present invention (SEQ ID NO:3). In addition, structure and
functional information, such as intron/exon structure, promoter location, etc., is provided where
available, allowing one to readily determine specific uses of inventions based on this molecular
sequence. As illustrated in Figure 3, SNPs were identified at 45 different nucleotide positions.

DETAILED DESCRIPTION OF THE INVENTION 
General Description 

The present invention is based on the sequencing of the human genome. During the
sequencing and assembly of the human genome, analysis of the sequence information revealed
previously unidentified fragments of the human genome that encode peptides that share
structural and/or sequence homology to protein/peptide/domains identified and characterized
within the art as being a drug-metabolizing enzyme protein or part of a drug-metabolizing
enzyme protein and are related to the omega-hydroxylase cytochrome P450 drug-metabolizing
enzyme subfamily. Utilizing these sequences, additional genomic sequences were assembled
and transcript and/or cDNA sequences were isolated and characterized. Based on this analysis,
the present invention provides amino acid sequences of human drug-metabolizing enzyme
peptides and proteins that are related to the omega-hydroxylase cytochrome P450
drug-metabolizing enzyme subfamily, nucleic acid sequences in the form of transcript sequences,
cDNA sequences and/or genomic sequences that encode these drug-metabolizing enzyme
peptides and proteins, nucleic acid variation (allelic information), tissue distribution of
expression, and information about the closest art-known protein/peptide/domain that has
structural or sequence homology to the drug-metabolizing enzyme of the present invention.

In addition to being previously unknown, the peptides that are provided in the present
invention are selected based on their ability to be used for the development of commercially
important products and services. Specifically, the present peptides are selected based on
homology and/or structural relatedness to known drug-metabolizing enzyme proteins of the
omega-hydroxylase cytochrome P450 drug-metabolizing enzyme subfamily and the expression
pattern observed. Experimental data as provided in Figure 1 indicates expression in humans in
the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland
tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. The art has clearly
established the commercial importance of members of this family of proteins and proteins that
have expression patterns similar to that of the present gene. Some of the more specific features
of the peptides of the present invention, and the uses thereof, are described herein, particularly in
the Background of the Invention and in the annotation provided in the Figures, and/or are known
within the art for each of the known omega-hydroxylase cytochrome P450 family or subfamily
of drug-metabolizing enzyme proteins.

Peptide Molecules

The present invention provides nucleic acid sequences that encode protein molecules that
have been identified as being members of the drug-metabolizing enzyme family of proteins and
are related to the omega-hydroxylase cytochrome P450 drug-metabolizing enzyme subfamily
(protein sequences are provided in Figure 2, transcript/cDNA sequences are provided in Figure 1
and genomic sequences are provided in Figure 3). The peptide sequences provided in Figure 2,
as well as the obvious variants described herein, particularly allelic variants as identified herein
and using the information in Figure 3, will be referred to herein as the drug-metabolizing enzyme
peptides of the present invention, drug-metabolizing enzyme peptides, or peptides/proteins of the
present invention.

The present invention provides isolated peptide and protein molecules that consist of,
consist essentially of, or comprise the amino acid sequences of the drug-metabolizing enzyme
peptides disclosed in Figure 2 (encoded by the nucleic acid molecule shown in Figure 1,
transcript/cDNA, or Figure 3, genomic sequence), as well as all obvious variants of these
peptides that are within the art to make and use. Some of these variants are described in detail
below.

As used herein, a peptide is said to be "isolated" or "purified" when it is substantially free
of cellular material or free of chemical precursors or other chemicals. The peptides of the present
invention can be purified to homogeneity or other degrees of purity. The level of purification will
be based on the intended use. The critical feature is that the preparation allows for the desired
function of the peptide, even if in the presence of considerable amounts of other components (the
features of an isolated nucleic acid molecule are discussed below).

In some uses, "substantially free of cellular material" includes preparations of the peptide
having less than about 30% (by dry weight) other proteins (i.e., contaminating protein), less than
about 20% other proteins, less than about 10% other proteins, or less than about 5% other proteins.
When the peptide is recombinantly produced, it can also be substantially free of culture medium,
i.e., culture medium represents less than about 20% of the volume of the protein preparation.

The language "substantially free of chemical precursors or other chemicals" includes
preparations of the peptide in which it is separated from chemical precursors or other chemicals that
are involved in its synthesis. In one embodiment, the language "substantially free of chemical
precursors or other chemicals" includes preparations of the drug-metabolizing enzyme peptide
having less than about 30% (by dry weight) chemical precursors or other chemicals, less than about
20% chemical precursors or other chemicals, less than about 10% chemical precursors or other
chemicals, or less than about 5% chemical precursors or other chemicals.

The isolated drug-metabolizing enzyme peptide can be purified from cells that naturally
express it, purified from cells that have been altered to express it (recombinant), or synthesized
using known protein synthesis methods. Experimental data as provided in Figure 1 indicates
expression in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney,
adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. For
example, a nucleic acid molecule encoding the drug-metabolizing enzyme peptide is cloned into an
expression vector, the expression vector introduced into a host cell and the protein expressed in the
host cell. The protein can then be isolated from the cells by an appropriate purification scheme
using standard protein purification techniques. Many of these techniques are described in detail
below.

Accordingly, the present invention provides proteins that consist of the amino acid
sequences provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the
transcript/cDNA nucleic acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic
sequences provided in Figure 3 (SEQ ID NO:3). The amino acid sequence of such a protein is
provided in Figure 2. A protein consists of an amino acid sequence when the amino acid sequence
is the final amino acid sequence of the protein.

The present invention further provides proteins that consist essentially of the amino acid
sequences provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the
transcript/cDNA nucleic acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic
sequences provided in Figure 3 (SEQ ID NO:3). A protein consists essentially of an amino acid
sequence when such an amino acid sequence is present with only a few additional amino acid
residues, for example from about 1 to about 100 or so additional residues, typically from 1 to about
20 additional residues in the final protein.

The present invention further provides proteins that comprise the amino acid sequences
provided in Figure 2 (SEQ ID NO:2), for example, proteins encoded by the transcript/cDNA nucleic
acid sequences shown in Figure 1 (SEQ ID NO:1) and the genomic sequences provided in Figure 3
(SEQ ID NO:3). A protein comprises an amino acid sequence when the amino acid sequence is at
least part of the final amino acid sequence of the protein. In such a fashion, the protein can be only
the peptide or have additional amino acid molecules, such as amino acid residues (contiguous
encoded sequence) that are naturally associated with it or heterologous amino acid residues/peptide
sequences. Such a protein can have a few additional amino acid residues or can comprise several
hundred or more additional amino acids. The preferred classes of proteins that are comprised of the
drug-metabolizing enzyme peptides of the present invention are the naturally occurring mature
proteins. A brief description of how various types of these proteins can be made/isolated is
provided below.

The drug-metabolizing enzyme peptides of the present invention can be attached to
heterologous sequences to form chimeric or fusion proteins. Such chimeric and fusion proteins
comprise a drug-metabolizing enzyme peptide operatively linked to a heterologous protein having
an amino acid sequence not substantially homologous to the drug-metabolizing enzyme peptide.
"Operatively linked" indicates that the drug-metabolizing enzyme peptide and the heterologous
protein are fused in-frame. The heterologous protein can be fused to the N-terminus or C-terminus
of the drug-metabolizing enzyme peptide.

In some uses, the fusion protein does not affect the activity of the drug-metabolizing enzyme
peptide per se. For example, the fusion protein can include, but is not limited to, enzymatic fusion
proteins, for example beta-galactosidase fusions, yeast two-hybrid GAL fusions, poly-His fusions,
MYC-tagged, HI-tagged and Ig fusions. Such fusion proteins, particularly poly-His fusions, can
facilitate the purification of recombinant drug-metabolizing enzyme peptide. In certain host cells
(e.g., mammalian host cells), expression and/or secretion of a protein can be increased by using a
heterologous signal sequence.

A chimeric or fusion protein can be produced by standard recombinant DNA techniques.
For example, DNA fragments coding for the different protein sequences are ligated together
in-frame in accordance with conventional techniques. In another embodiment, the fusion gene can be
synthesized by conventional techniques including automated DNA synthesizers. Alternatively, PCR
amplification of gene fragments can be carried out using anchor primers which give rise to
complementary overhangs between two consecutive gene fragments which can subsequently be
annealed and re-amplified to generate a chimeric gene sequence (see Ausubel et al., Current
Protocols in Molecular Biology, 1992). Moreover, many expression vectors are commercially
available that already encode a fusion moiety (e.g., a GST protein). A drug-metabolizing enzyme
peptide-encoding nucleic acid can be cloned into such an expression vector such that the fusion
moiety is linked in-frame to the drug-metabolizing enzyme peptide.

As mentioned above, the present invention also provides and enables obvious variants of the
amino acid sequence of the proteins of the present invention, such as naturally occurring mature
forms of the peptide, allelic/sequence variants of the peptides, non-naturally occurring
recombinantly derived variants of the peptides, and orthologs and paralogs of the peptides. Such
variants can readily be generated using art-known techniques in the fields of recombinant nucleic
acid technology and protein biochemistry. It is understood, however, that variants exclude any
amino acid sequences disclosed prior to the invention.

Such variants can readily be identified/made using molecular techniques and the sequence
information disclosed herein. Further, such variants can readily be distinguished from other
peptides based on sequence and/or structural homology to the drug-metabolizing enzyme peptides
of the present invention. The degree of homology/identity present will be based primarily on
whether the peptide is a functional variant or non-functional variant, the amount of divergence
present in the paralog family and the evolutionary distance between the orthologs.

To determine the percent identity of two amino acid sequences or two nucleic acid
sequences, the sequences are aligned for optimal comparison purposes (e.g., gaps can be
introduced in one or both of a first and a second amino acid or nucleic acid sequence for optimal
alignment and non-homologous sequences can be disregarded for comparison purposes). In a
preferred embodiment, at least 30%, 40%, 50%, 60%, 70%, 80%, or 90% or more of the length
of a reference sequence is aligned for comparison purposes. The amino acid residues or
nucleotides at corresponding amino acid positions or nucleotide positions are then compared.
When a position in the first sequence is occupied by the same amino acid residue or nucleotide
as the corresponding position in the second sequence, then the molecules are identical at that
position (as used herein amino acid or nucleic acid "identity" is equivalent to amino acid or
nucleic acid "homology"). The percent identity between the two sequences is a function of the
number of identical positions shared by the sequences, taking into account the number of gaps,
and the length of each gap, which need to be introduced for optimal alignment of the two
sequences.
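
By way of illustration only, the percent identity determination described above can be sketched
in Python. The sketch performs a simple global alignment in the manner of Needleman and Wunsch
and then counts identical positions. The scoring values (match, mismatch, linear gap penalty),
the function names, and the choice to divide by the full alignment length (gap columns included)
are assumptions made for this example; they are not the parameters of the GAP or ALIGN programs.

```python
def _sub(x, y, match, mismatch):
    """Score for aligning residue x against residue y (assumed flat scoring)."""
    return match if x == y else mismatch


def global_align(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align sequences a and b; return the two gapped strings."""
    n, m = len(a), len(b)
    # score[i][j] = best score aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1], match, mismatch),
                score[i - 1][j] + gap,   # gap introduced in b
                score[i][j - 1] + gap,   # gap introduced in a
            )
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and score[i][j]
                == score[i - 1][j - 1] + _sub(a[i - 1], b[j - 1], match, mismatch)):
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Identical aligned positions as a percentage of the alignment length."""
    al_a, al_b = global_align(a, b)
    if not al_a:
        return 0.0
    identical = sum(1 for x, y in zip(al_a, al_b) if x == y and x != "-")
    return 100.0 * identical / len(al_a)
```

Because gap columns count toward the alignment length in this sketch, both the number and the
length of the gaps lower the reported identity. Other denominators (for example, the length of
the shorter sequence) are also in use and would give different values.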

The comparison of sequences and determination of percent identity and similarity
between two sequences can be accomplished using a mathematical algorithm. (Computational
Molecular Biology, Lesk, A.M., ed., Oxford University Press, New York, 1988; Biocomputing:
Informatics and Genome Projects, Smith, D.W., ed., Academic Press, New York, 1993; Computer
Analysis of Sequence Data, Part 1, Griffin, A.M., and Griffin, H.G., eds., Humana Press, New
Jersey, 1994; Sequence Analysis in Molecular Biology, von Heijne, G., Academic Press, 1987; and
Sequence Analysis Primer, Gribskov, M. and Devereux, J., eds., M Stockton Press, New York,
1991). In a preferred embodiment, the percent identity between two amino acid sequences is
determined using the Needleman and Wunsch (J. Mol. Biol. (48):444-453 (1970)) algorithm
which has been incorporated into the GAP program in the GCG software package (available at
http://www.gcg.com), using either a BLOSUM 62 matrix or a PAM250 matrix, and a gap weight
of 16, 14, 12, 10, 8, 6, or 4 and a length weight of 1, 2, 3, 4, 5, or 6. In yet another preferred
embodiment, the percent identity between two nucleotide sequences is determined using the
GAP program in the GCG software package (Devereux, J., et al., Nucleic Acids Res. 12(1):387
(1984)) (available at http://www.gcg.com), using a NWSgapdna.CMP matrix and a gap weight of
40, 50, 60, 70, or 80 and a length weight of 1, 2, 3, 4, 5, or 6. In another embodiment, the
percent identity between two amino acid or nucleotide sequences is determined using the
algorithm of E. Myers and W. Miller (CABIOS, 4:11-17 (1989)) which has been incorporated
into the ALIGN program (version 2.0), using a PAM120 weight residue table, a gap length
penalty of 12 and a gap penalty of 4.

The nucleic acid and protein sequences of the present invention can further be used as a
"query sequence" to perform a search against sequence databases to, for example, identify other
family members or related sequences. Such searches can be performed using the NBLAST and
XBLAST programs (version 2.0) of Altschul, et al. (J. Mol. Biol. 215:403-10 (1990)). BLAST
nucleotide searches can be performed with the NBLAST program, score = 100, wordlength = 12
to obtain nucleotide sequences homologous to the nucleic acid molecules of the invention.
BLAST protein searches can be performed with the XBLAST program, score = 50, wordlength =
3 to obtain amino acid sequences homologous to the proteins of the invention. To obtain gapped
alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et
al. (Nucleic Acids Res. 25(17):3389-3402 (1997)). When utilizing BLAST and gapped BLAST
programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can
be used.

Full-length pre-processed forms, as well as mature processed forms, of proteins that
comprise one of the peptides of the present invention can readily be identified as having complete
sequence identity to one of the drug-metabolizing enzyme peptides of the present invention as well
as being encoded by the same genetic locus as the drug-metabolizing enzyme peptide provided
herein. The gene encoding the novel drug-metabolizing protein of the present invention is located
on a genome component that has been mapped to human chromosome 1 (as indicated in Figure 3),
which is supported by multiple lines of evidence, such as STS and BAC map data.

Allelic variants of a drug-metabolizing enzyme peptide can readily be identified as being a
human protein having a high degree (significant) of sequence homology/identity to at least a portion
of the drug-metabolizing enzyme peptide as well as being encoded by the same genetic locus as the
drug-metabolizing enzyme peptide provided herein. Genetic locus can readily be determined based
on the genomic information provided in Figure 3, such as the genomic sequence mapped to the
reference human. The gene encoding the novel drug-metabolizing protein of the present invention
is located on a genome component that has been mapped to human chromosome 1 (as indicated in
Figure 3), which is supported by multiple lines of evidence, such as STS and BAC map data. As
used herein, two proteins (or a region of the proteins) have significant homology when the amino
acid sequences are typically at least about 70-80%, 80-90%, and more typically at least about
90-95% or more homologous. A significantly homologous amino acid sequence, according to the
present invention, will be encoded by a nucleic acid sequence that will hybridize to a
drug-metabolizing enzyme peptide encoding nucleic acid molecule under stringent conditions as more
fully described below.

Figure 3 provides SNP information that has been found in the gene encoding the
drug-metabolizing proteins of the present invention. SNPs, including insertion/deletion variants
("indels"), were identified at 45 different nucleotide positions. Changes in the amino acid
sequence caused by these SNPs can readily be determined using the universal genetic code and
the protein sequence provided in Figure 2 as a reference. Positioning of each SNP in exons,
introns, or outside the ORF can readily be determined using the DNA positions given for each
SNP and the start/stop, exon, and intron coordinates given in the features.
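
As a minimal sketch of how an amino acid change caused by a single-nucleotide substitution can
be read off the universal genetic code, the following Python example translates the affected
codon before and after the change. It assumes that the SNP position has already been mapped from
genomic coordinates to a 0-based offset within an in-frame coding sequence; that mapping, the
function name, and the toy sequence in the usage note are hypothetical and are not taken from the
Figures. Indels are not handled.

```python
# Standard genetic code, built from the conventional T, C, A, G ordering.
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W"
         "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR"
         "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def snp_effect(cds, pos, alt):
    """Return (reference_residue, variant_residue) for a substitution.

    cds: coding sequence (DNA, upper case, starting in frame)
    pos: 0-based offset of the substituted base within cds (assumed input)
    alt: the variant base
    """
    start = pos - pos % 3                 # first base of the affected codon
    ref_codon = cds[start:start + 3]
    if len(ref_codon) != 3:
        raise ValueError("position falls in an incomplete trailing codon")
    alt_codon = ref_codon[:pos % 3] + alt + ref_codon[pos % 3 + 1:]
    return CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
```

For the toy sequence `ATGGAAGTT` (Met-Glu-Val), `snp_effect("ATGGAAGTT", 4, "T")` changes GAA
to GTA and so reports a Glu to Val change, whereas `snp_effect("ATGGAAGTT", 8, "C")` changes
GTT to GTC and is silent.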

Paralogs of a drug-metabolizing enzyme peptide can readily be identified as having some
degree of significant sequence homology/identity to at least a portion of the drug-metabolizing
enzyme peptide, as being encoded by a gene from humans, and as having similar activity or
function. Two proteins will typically be considered paralogs when the amino acid sequences are
typically at least about 60% or greater, and more typically at least about 70% or greater
homology through a given region or domain. Such paralogs will be encoded by a nucleic acid
sequence that will hybridize to a drug-metabolizing enzyme peptide encoding nucleic acid
molecule under moderate to stringent conditions as more fully described below.

Orthologs of a drug-metabolizing enzyme peptide can readily be identified as having some
degree of significant sequence homology/identity to at least a portion of the drug-metabolizing
enzyme peptide as well as being encoded by a gene from another organism. Preferred orthologs
will be isolated from mammals, preferably primates, for the development of human therapeutic
targets and agents. Such orthologs will be encoded by a nucleic acid sequence that will hybridize
to a drug-metabolizing enzyme peptide encoding nucleic acid molecule under moderate to
stringent conditions, as more fully described below, depending on the degree of relatedness of
the two organisms yielding the proteins.

Non-naturally occurring variants of the drug-metabolizing enzyme peptides of the present
invention can readily be generated using recombinant techniques. Such variants include, but are not
limited to, deletions, additions and substitutions in the amino acid sequence of the drug-metabolizing
enzyme peptide. For example, one class of substitutions is conserved amino acid substitutions.
Such substitutions are those that substitute a given amino acid in a drug-metabolizing enzyme
peptide by another amino acid of like characteristics. Typically seen as conservative substitutions
are the replacements, one for another, among the aliphatic amino acids Ala, Val, Leu, and Ile;
interchange of the hydroxyl residues Ser and Thr; exchange of the acidic residues Asp and Glu;
substitution between the amide residues Asn and Gln; exchange of the basic residues Lys and Arg;
and replacements among the aromatic residues Phe and Tyr. Guidance concerning which amino
acid changes are likely to be phenotypically silent is found in Bowie et al., Science 247:1306-1310
(1990).
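
The conservative substitution classes listed above lend themselves to a simple lookup. The
following Python sketch encodes exactly those six classes and tests whether a given replacement
stays within one class. The function and variable names are illustrative, and treating an
unchanged residue as "not a substitution" is an assumption of the example.

```python
# The six classes named in the text, using three-letter residue codes.
CONSERVATIVE_GROUPS = [
    {"Ala", "Val", "Leu", "Ile"},  # aliphatic
    {"Ser", "Thr"},                # hydroxyl
    {"Asp", "Glu"},                # acidic
    {"Asn", "Gln"},                # amide
    {"Lys", "Arg"},                # basic
    {"Phe", "Tyr"},                # aromatic
]


def is_conservative(original, replacement):
    """True if both residues fall within one of the listed classes."""
    if original == replacement:
        return False  # no substitution took place
    return any(original in group and replacement in group
               for group in CONSERVATIVE_GROUPS)
```

Under these classes, replacing Leu with Ile is conservative, while replacing Asp with Lys
(acidic to basic) is not. Residues that appear in none of the listed classes, such as Gly or
Pro, are never reported as conservative by this sketch.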

Variant drug-metabolizing enzyme peptides can be fully functional or can lack function in
one or more activities, e.g. ability to bind substrate, ability to phosphorylate substrate, ability to
mediate signaling, etc. Fully functional variants typically contain only conservative variation or
variation in non-critical residues or in non-critical regions. Figure 2 provides the result of protein
analysis and can be used to identify critical domains/regions. Functional variants can also contain
substitution of similar amino acids that result in no change or an insignificant change in function.
Alternatively, such substitutions may positively or negatively affect function to some degree.

Non-functional variants typically contain one or more non-conservative amino acid
substitutions, deletions, insertions, inversions, or truncation or a substitution, insertion, inversion, or
deletion in a critical residue or critical region.

Amino acids that are essential for function can be identified by methods known in the art,
such as site-directed mutagenesis or alanine-scanning mutagenesis (Cunningham et al., Science
244:1081-1085 (1989)), particularly using the results provided in Figure 2. The latter procedure
introduces single alanine mutations at every residue in the molecule. The resulting mutant
molecules are then tested for biological activity such as drug-metabolizing enzyme activity or in
assays such as an in vitro proliferative activity. Sites that are critical for binding partner/substrate
binding can also be determined by structural analysis such as crystallization, nuclear magnetic
resonance or photoaffinity labeling (Smith et al., J. Mol. Biol. 224:899-904 (1992); de Vos et al.
Science 255:306-312 (1992)).
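
The alanine-scanning procedure described above can be illustrated with a short Python sketch
that enumerates the single-alanine mutants of a sequence given in one-letter code. The function
name, the mutation labels (for example `K3A`), and the decision to skip positions that are
already alanine are assumptions of this example; the sketch only generates the mutant sequences
and says nothing about how their activity would be assayed.

```python
def alanine_scan(seq):
    """Return (label, mutant_sequence) pairs, one per non-alanine position.

    seq is a protein sequence in one-letter code. Labels use 1-based
    residue numbering, e.g. "K3A" for Lys at position 3 replaced by Ala.
    """
    mutants = []
    for i, residue in enumerate(seq):
        if residue == "A":
            continue  # assumed: an existing alanine yields no new mutant
        mutant = seq[:i] + "A" + seq[i + 1:]
        mutants.append((f"{residue}{i + 1}A", mutant))
    return mutants
```

For the toy tripeptide `MAK`, this yields the mutants `M1A` (`AAK`) and `K3A` (`MAA`).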

The present invention further provides fragments of the drug-metabolizing enzyme peptides,
in addition to proteins and peptides that comprise and consist of such fragments, particularly those
comprising the residues identified in Figure 2. The fragments to which the invention pertains,
however, are not to be construed as encompassing fragments that may be disclosed publicly prior to
the present invention.

As used herein, a fragment comprises at least 8, 10, 12, 14, 16, or more contiguous amino
acid residues from a drug-metabolizing enzyme peptide. Such fragments can be chosen based on
the ability to retain one or more of the biological activities of the drug-metabolizing enzyme peptide
or could be chosen for the ability to perform a function, e.g. bind a substrate or act as an
immunogen. Particularly important fragments are biologically active fragments, peptides that are,
for example, about 8 or more amino acids in length. Such fragments will typically comprise a
domain or motif of the drug-metabolizing enzyme peptide, e.g., active site, a transmembrane
domain or a substrate-binding domain. Further, possible fragments include, but are not limited to,
domain or motif containing fragments, soluble peptide fragments, and fragments containing
immunogenic structures. Predicted domains and functional sites are readily identifiable by computer
programs well known and readily available to those of skill in the art (e.g., PROSITE analysis). The
results of one such analysis are provided in Figure 2.
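
To make the notion of a fragment of contiguous residues concrete, the following Python sketch
lists every contiguous window of a chosen length from a sequence in one-letter code. The
function name and the default length of 8 (the smallest value recited above) are choices made
for the example; selecting which windows retain activity or contain a domain is outside the
scope of the sketch.

```python
def contiguous_fragments(seq, length=8):
    """Return all contiguous fragments of exactly `length` residues."""
    if length < 1:
        raise ValueError("length must be at least 1")
    # range() is empty when the sequence is shorter than the window.
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]
```

A sequence of N residues yields N - length + 1 such fragments, so a 10-residue peptide has
three 8-residue fragments and a peptide shorter than 8 residues has none.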

Polypeptides often contain amino acids other than the 20 amino acids commonly referred to
as the 20 naturally occurring amino acids. Further, many amino acids, including the terminal amino
acids, may be modified by natural processes, such as processing and other post-translational
modifications, or by chemical modification techniques well known in the art. Common
modifications that occur naturally in drug-metabolizing enzyme peptides are described in basic
texts, detailed monographs, and the research literature, and they are well known to those of skill in
the art (some of these features are identified in Figure 2).

Known modifications include, but are not limited to, acetylation, acylation,
ADP-ribosylation, amidation, covalent attachment of flavin, covalent attachment of a heme moiety,
covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid
derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond
formation, demethylation, formation of covalent crosslinks, formation of cystine, formation of
pyroglutamate, formylation, gamma carboxylation, glycosylation, GPI anchor formation,
hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing,
phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated
addition of amino acids to proteins such as arginylation, and ubiquitination.

Such modifications are well known to those of skill in the art and have been described in
great detail in the scientific literature. Several particularly common modifications, glycosylation,
lipid attachment, sulfation, gamma-carboxylation of glutamic acid residues, hydroxylation and
ADP-ribosylation, for instance, are described in most basic texts, such as Proteins - Structure and
Molecular Properties, 2nd Ed., T.E. Creighton, W. H. Freeman and Company, New York (1993).
Many detailed reviews are available on this subject, such as by Wold, F., Posttranslational Covalent
Modification of Proteins, B.C. Johnson, Ed., Academic Press, New York 1-12 (1983); Seifter et al.
(Meth. Enzymol. 182: 626-646 (1990)) and Rattan et al. (Ann. N.Y. Acad. Sci. 663:48-62 (1992)).

Accordingly, the drug-metabolizing enzyme peptides of the present invention also
encompass derivatives or analogs in which a substituted amino acid residue is not one encoded by
the genetic code, in which a substituent group is included, in which the mature drug-metabolizing
enzyme peptide is fused with another compound, such as a compound to increase the half-life of the
drug-metabolizing enzyme peptide (for example, polyethylene glycol), or in which the additional
amino acids are fused to the mature drug-metabolizing enzyme peptide, such as a leader or secretory
sequence or a sequence for purification of the mature drug-metabolizing enzyme peptide or a
pro-protein sequence.

Protein/Peptide Uses

The proteins of the present invention can be used in substantial and specific assays
related to the functional information provided in the Figures; to raise antibodies or to elicit
another immune response; as a reagent (including the labeled reagent) in assays designed to
quantitatively determine levels of the protein (or its binding partner or ligand) in biological
fluids; and as markers for tissues in which the corresponding protein is preferentially expressed
(either constitutively or at a particular stage of tissue differentiation or development or in a
disease state). Where the protein binds or potentially binds to another protein or ligand (such as,
for example, in a drug-metabolizing enzyme-effector protein interaction or drug-metabolizing
enzyme-ligand interaction), the protein can be used to identify the binding partner/ligand so as to
develop a system to identify inhibitors of the binding interaction. Any or all of these uses are
capable of being developed into reagent grade or kit format for commercialization as commercial
products.

Methods for performing the uses listed above are well known to those skilled in the art.
References disclosing such methods include "Molecular Cloning: A Laboratory Manual", 2d ed.,
Cold Spring Harbor Laboratory Press, Sambrook, J., E. F. Fritsch and T. Maniatis eds., 1989,
and "Methods in Enzymology: Guide to Molecular Cloning Techniques", Academic Press,
Berger, S. L. and A. R. Kimmel eds., 1987.

The potential uses of the peptides of the present invention are based primarily on the
source of the protein as well as the class/action of the protein. For example, drug-metabolizing
enzymes isolated from humans and their human/mammalian orthologs serve as targets for
identifying agents for use in mammalian therapeutic applications, e.g. a human drug, particularly
in modulating a biological or pathological response in a cell or tissue that expresses the
drug-metabolizing enzyme. Experimental data as provided in Figure 1 indicates that
drug-metabolizing enzyme proteins of the present invention are expressed in humans in the stomach,
brain (including infant), endometrial tumors, prostate, kidney, adrenal gland tumors, head/neck,
sympathetic trunk, breast, and hepatocellular carcinomas, as indicated by virtual northern blot
analysis. PCR-based tissue screening panels also indicate expression in the brain. A large
percentage of pharmaceutical agents are being developed that modulate the activity of
drug-metabolizing enzyme proteins, particularly members of the omega-hydroxylase cytochrome
P450 subfamily (see Background of the Invention). The structural and functional information
provided in the Background and Figures provides specific and substantial uses for the molecules
of the present invention, particularly in combination with the expression information provided in
Figure 1. Experimental data as provided in Figure 1 indicates expression in humans in the
stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland tumors,
head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. Such uses can readily be
determined using the information provided herein, that which is known in the art, and routine
experimentation.

The drug-metabolizing enzyme polypeptides (including variants and fragments that may
have been disclosed prior to the present invention) are useful for biological assays related to
drug-metabolizing enzymes that are related to members of the omega-hydroxylase cytochrome P450
subfamily. Such assays involve any of the known drug-metabolizing enzyme functions or activities
or properties useful for diagnosis and treatment of drug-metabolizing enzyme-related conditions
that are specific for the subfamily of drug-metabolizing enzymes to which the one of the present
invention belongs, particularly in cells and tissues that express the drug-metabolizing enzyme.
Experimental data as provided in Figure 1 indicates that drug-metabolizing enzyme proteins of the
present invention are expressed in humans in the stomach, brain (including infant), endometrial
tumors, prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and
hepatocellular carcinomas, as indicated by virtual northern blot analysis. PCR-based tissue
screening panels also indicate expression in the brain.

The drug-metabolizing enzyme polypeptides are also useful in drug screening assays, in
cell-based or cell-free systems. Cell-based systems can be native, i.e., cells that normally express
the drug-metabolizing enzyme, as a biopsy or expanded in cell culture. Experimental data as
provided in Figure 1 indicates expression in humans in the stomach, brain (including infant),
endometrial tumors, prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast,
and hepatocellular carcinomas. In an alternate embodiment, cell-based assays involve recombinant
host cells expressing the drug-metabolizing enzyme protein.

The polypeptides can be used to identify compounds that modulate drug-metabolizing
enzyme activity of the protein in its natural state or an altered form that causes a specific disease or
pathology associated with the drug-metabolizing enzyme. Both the drug-metabolizing enzymes of
the present invention and appropriate variants and fragments can be used in high-throughput screens
to assay candidate compounds for the ability to bind to the drug-metabolizing enzyme. These
compounds can be further screened against a functional drug-metabolizing enzyme to determine the
effect of the compound on the drug-metabolizing enzyme activity. Further, these compounds can
be tested in animal or invertebrate systems to determine activity/effectiveness. Compounds can be
identified that activate (agonist) or inactivate (antagonist) the drug-metabolizing enzyme to a
desired degree.

Further, the drug-metabolizing enzyme polypeptides can be used to screen a compound for
the ability to stimulate or inhibit interaction between the drug-metabolizing enzyme protein and a
molecule that normally interacts with the drug-metabolizing enzyme protein. Such assays typically
include the steps of combining the drug-metabolizing enzyme protein with a candidate compound
under conditions that allow the drug-metabolizing enzyme protein, or fragment, to interact with the
target molecule, and to detect the formation of a complex between the protein and the target or to
detect the biochemical consequence of the interaction with the drug-metabolizing enzyme protein
and the target.

Candidate compounds include, for example, 1) peptides such as soluble peptides, including
Ig-tailed fusion peptides and members of random peptide libraries (see, e.g., Lam et al., Nature
354:82-84 (1991); Houghten et al., Nature 354:84-86 (1991)) and combinatorial chemistry-derived
molecular libraries made of D- and/or L- configuration amino acids; 2) phosphopeptides (e.g.,
members of random and partially degenerate, directed phosphopeptide libraries, see, e.g., Songyang
et al., Cell 72:767-778 (1993)); 3) antibodies (e.g., polyclonal, monoclonal, humanized,
anti-idiotypic, chimeric, and single chain antibodies as well as Fab, F(ab')2, Fab expression library
fragments, and epitope-binding fragments of antibodies); and 4) small organic and inorganic
molecules (e.g., molecules obtained from combinatorial and natural product libraries).

One candidate compound is a soluble fragment of the receptor that competes for substrate
binding. Other candidate compounds include mutant drug-metabolizing enzymes or appropriate
fragments containing mutations that affect drug-metabolizing enzyme function and thus compete for
substrate. Accordingly, a fragment that competes for substrate, for example with a higher affinity,
or a fragment that binds substrate but does not allow release, is encompassed by the invention.

Any of the biological or biochemical functions mediated by the drug-metabolizing enzyme
can be used as an endpoint assay. These include all of the biochemical or biochemical/biological
events described herein, in the references cited herein, incorporated by reference for these endpoint
assay targets, and other functions known to those of ordinary skill in the art or that can be readily
identified using the information provided in the Figures, particularly Figure 2. Specifically, a
biological function of a cell or tissues that expresses the drug-metabolizing enzyme can be assayed.
Experimental data as provided in Figure 1 indicates that drug-metabolizing enzyme proteins of the
present invention are expressed in humans in the stomach, brain (including infant), endometrial
tumors, prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and
hepatocellular carcinomas, as indicated by virtual northern blot analysis. PCR-based tissue
screening panels also indicate expression in the brain.

Binding and/or activating compounds can also be screened by using chimeric
drug-metabolizing enzyme proteins in which the amino terminal extracellular domain, or parts thereof,
the entire transmembrane domain or subregions, such as any of the seven transmembrane segments
or any of the intracellular or extracellular loops and the carboxy terminal intracellular domain, or
parts thereof, can be replaced by heterologous domains or subregions. For example, a
substrate-binding region can be used that interacts with a different substrate than that which is
recognized by the native drug-metabolizing enzyme. Accordingly, a different set of signal transduction
components is available as an end-point assay for activation. This allows for assays to be performed
in other than the specific host cell from which the drug-metabolizing enzyme is derived.

The drug-metabolizing enzyme polypeptides are also useful in competition binding assays
in methods designed to discover compounds that interact with the drug-metabolizing enzyme (e.g.,
binding partners and/or ligands). Thus, a compound is exposed to a drug-metabolizing enzyme
polypeptide under conditions that allow the compound to bind or to otherwise interact with the
polypeptide. Soluble drug-metabolizing enzyme polypeptide is also added to the mixture. If the
test compound interacts with the soluble drug-metabolizing enzyme polypeptide, it decreases the
amount of complex formed or activity from the drug-metabolizing enzyme target. This type of
assay is particularly useful in cases in which compounds are sought that interact with specific
regions of the drug-metabolizing enzyme. Thus, the soluble polypeptide that competes with the
target drug-metabolizing enzyme region is designed to contain peptide sequences corresponding to
the region of interest.
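
The readout of such a competition assay can be summarized as the percent reduction in complex
formation (or activity) when the soluble polypeptide is present. The following is only a minimal
sketch: the function name, the background-subtraction step, and the example signal values are
illustrative assumptions and are not taken from the specification.

```python
def percent_inhibition(signal_with_competitor, signal_without_competitor, background=0.0):
    """Percent reduction in target complex signal caused by the soluble competitor.

    All three arguments are raw assay readings in the same (arbitrary) units.
    `background` is a hypothetical blank reading subtracted from both signals.
    """
    control = signal_without_competitor - background
    if control <= 0:
        raise ValueError("control signal must exceed background")
    test = signal_with_competitor - background
    # 0% means the competitor had no effect; 100% means complex formation was abolished.
    return 100.0 * (1.0 - test / control)


if __name__ == "__main__":
    # Hypothetical readings: 1000 units without competitor, 250 units with it.
    print(percent_inhibition(250.0, 1000.0))
```

A value near zero would suggest that the test compound does not interact with the region
represented by the soluble polypeptide under the conditions assumed here.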

To perform cell free drug screening assays, it is sometimes desirable to immobilize either
the drug-metabolizing enzyme protein, or fragment, or its target molecule to facilitate separation of
complexes from uncomplexed forms of one or both of the proteins, as well as to accommodate
automation of the assay.

Techniques for immobilizing proteins on matrices can be used in the drug screening assays.
In one embodiment, a fusion protein can be provided which adds a domain that allows the protein to
be bound to a matrix. For example, glutathione-S-transferase fusion proteins can be adsorbed onto
glutathione sepharose beads (Sigma Chemical, St. Louis, MO) or glutathione derivatized microtitre
plates, which are then combined with the cell lysates (e.g., ³⁵S-labeled) and the candidate
compound, and the mixture incubated under conditions conducive to complex formation (e.g., at
physiological conditions for salt and pH). Following incubation, the beads are washed to remove
any unbound label, and the matrix immobilized and radiolabel determined directly, or in the
supernatant after the complexes are dissociated. Alternatively, the complexes can be dissociated
from the matrix, separated by SDS-PAGE, and the level of drug-metabolizing enzyme-binding
protein found in the bead fraction quantitated from the gel using standard electrophoretic
techniques. For example, either the polypeptide or its target molecule can be immobilized utilizing
conjugation of biotin and streptavidin using techniques well known in the art. Alternatively,
antibodies reactive with the protein but which do not interfere with binding of the protein to its
target molecule can be derivatized to the wells of the plate, and the protein trapped in the wells by
antibody conjugation. Preparations of a drug-metabolizing enzyme-binding protein and a candidate
compound are incubated in the drug-metabolizing enzyme protein-presenting wells and the amount
of complex trapped in the well can be quantitated. Methods for detecting such complexes, in
addition to those described above for the GST-immobilized complexes, include immunodetection of
complexes using antibodies reactive with the drug-metabolizing enzyme protein target molecule, or
which are reactive with drug-metabolizing enzyme protein and compete with the target molecule, as
well as enzyme-linked assays which rely on detecting an enzymatic activity associated with the
target molecule.

Agents that modulate one of the drug-metabolizing enzymes of the present invention can be
identified using one or more of the above assays, alone or in combination. It is generally preferable
to use a cell-based or cell free system first and then confirm activity in an animal or other model
system. Such model systems are well known in the art and can readily be employed in this context.

Modulators of drug-metabolizing enzyme protein activity identified according to these drug
screening assays can be used to treat a subject with a disorder mediated by the drug-metabolizing
enzyme pathway, by treating cells or tissues that express the drug-metabolizing enzyme.
Experimental data as provided in Figure 1 indicates expression in humans in the stomach, brain
(including infant), endometrial tumors, prostate, kidney, adrenal gland tumors, head/neck,
sympathetic trunk, breast, and hepatocellular carcinomas. These methods of treatment include the
steps of administering a modulator of drug-metabolizing enzyme activity in a pharmaceutical
composition to a subject in need of such treatment, the modulator being identified as described
herein.

In yet another aspect of the invention, the drug-metabolizing enzyme proteins can be
used as "bait proteins" in a two-hybrid assay or three-hybrid assay (see, e.g., U.S. Patent No.
5,283,317; Zervos et al. (1993) Cell 72:223-232; Madura et al. (1993) J. Biol. Chem. 268:12046-
12054; Bartel et al. (1993) Biotechniques 14:920-924; Iwabuchi et al. (1993) Oncogene 8:1693-
1696; and Brent WO94/10300), to identify other proteins, which bind to or interact with the
drug-metabolizing enzyme and are involved in drug-metabolizing enzyme activity. Such
drug-metabolizing enzyme-binding proteins are likely to be drug-metabolizing enzyme inhibitors.

The two-hybrid system is based on the modular nature of most transcription factors,
which consist of separable DNA-binding and activation domains. Briefly, the assay utilizes two
different DNA constructs. In one construct, the gene that codes for a drug-metabolizing enzyme
protein is fused to a gene encoding the DNA binding domain of a known transcription factor
(e.g., GAL-4). In the other construct, a DNA sequence, from a library of DNA sequences, that
encodes an unidentified protein ("prey" or "sample") is fused to a gene that codes for the
activation domain of the known transcription factor. If the "bait" and the "prey" proteins are able
to interact, in vivo, forming a drug-metabolizing enzyme-dependent complex, the DNA-binding
and activation domains of the transcription factor are brought into close proximity. This
proximity allows transcription of a reporter gene (e.g., LacZ) which is operably linked to a
transcriptional regulatory site responsive to the transcription factor. Expression of the reporter
gene can be detected and cell colonies containing the functional transcription factor can be
isolated and used to obtain the cloned gene which encodes the protein which interacts with the
drug-metabolizing enzyme protein.

This invention further pertains to novel agents identified by the above-described
screening assays. Accordingly, it is within the scope of this invention to further use an agent
identified as described herein in an appropriate animal model. For example, an agent identified
as described herein (e.g., a drug-metabolizing enzyme-modulating agent, an antisense
drug-metabolizing enzyme nucleic acid molecule, a drug-metabolizing enzyme-specific antibody, or a
drug-metabolizing enzyme-binding partner) can be used in an animal or other model to
determine the efficacy, toxicity, or side effects of treatment with such an agent. Alternatively, an
agent identified as described herein can be used in an animal or other model to determine the
mechanism of action of such an agent. Furthermore, this invention pertains to uses of novel
agents identified by the above-described screening assays for treatments as described herein.

The drug-metabolizing enzyme proteins of the present invention are also useful to provide a
target for diagnosing a disease or predisposition to disease mediated by the peptide. Accordingly,
the invention provides methods for detecting the presence, or levels of, the protein (or encoding
mRNA) in a cell, tissue, or organism. Experimental data as provided in Figure 1 indicates
expression in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney,
adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. The
method involves contacting a biological sample with a compound capable of interacting with the
drug-metabolizing enzyme protein such that the interaction can be detected. Such an assay can be
provided in a single detection format or a multi-detection format such as an antibody chip array.

One agent for detecting a protein in a sample is an antibody capable of selectively binding to
protein. A biological sample includes tissues, cells and biological fluids isolated from a subject, as
well as tissues, cells and fluids present within a subject.

The peptides of the present invention also provide targets for diagnosing active protein
activity, disease, or predisposition to disease, in a patient having a variant peptide, particularly
activities and conditions that are known for other members of the family of proteins to which the
present one belongs. Thus, the peptide can be isolated from a biological sample and assayed for the
presence of a genetic mutation that results in aberrant peptide. This includes amino acid
substitution, deletion, insertion, rearrangement (as the result of aberrant splicing events), and
inappropriate post-translational modification. Analytic methods include altered electrophoretic
mobility, altered tryptic peptide digest, altered drug-metabolizing enzyme activity in cell-based or
cell-free assay, alteration in substrate or antibody-binding pattern, altered isoelectric point, direct
amino acid sequencing, and any other of the known assay techniques useful for detecting mutations
in a protein. Such an assay can be provided in a single detection format or a multi-detection format
such as an antibody chip array.

In vitro techniques for detection of peptide include enzyme linked immunosorbent assays
(ELISAs), Western blots, immunoprecipitations and immunofluorescence using a detection reagent,
such as an antibody or protein binding agent. Alternatively, the peptide can be detected in vivo in a
subject by introducing into the subject a labeled anti-peptide antibody or other types of detection
agent. For example, the antibody can be labeled with a radioactive marker whose presence and
location in a subject can be detected by standard imaging techniques. Particularly useful are
methods that detect the allelic variant of a peptide expressed in a subject and methods which detect
fragments of a peptide in a sample.

The peptides are also useful in pharmacogenomic analysis. Pharmacogenomics deal with
clinically significant hereditary variations in the response to drugs due to altered drug disposition
and abnormal action in affected persons. See, e.g., Eichelbaum, M. (Clin. Exp. Pharmacol. Physiol.
23(10-11):983-985 (1996)), and Linder, M.W. (Clin. Chem. 43(2):254-266 (1997)). The clinical
outcomes of these variations result in severe toxicity of therapeutic drugs in certain individuals or
therapeutic failure of drugs in certain individuals as a result of individual variation in metabolism.
Thus, the genotype of the individual can determine the way a therapeutic compound acts on the
body or the way the body metabolizes the compound. Further, the activity of drug metabolizing
enzymes affects both the intensity and duration of drug action. Thus, the pharmacogenomics of the
individual permit the selection of effective compounds and effective dosages of such compounds for
prophylactic or therapeutic treatment based on the individual's genotype. The discovery of genetic
polymorphisms in some drug metabolizing enzymes has explained why some patients do not obtain
the expected drug effects, show an exaggerated drug effect, or experience serious toxicity from
standard drug dosages. Polymorphisms can be expressed in the phenotype of the extensive
metabolizer and the phenotype of the poor metabolizer. Accordingly, genetic polymorphism may
lead to allelic protein variants of the drug-metabolizing enzyme protein in which one or more of the
drug-metabolizing enzyme functions in one population is different from those in another population.
The peptides thus allow a target to ascertain a genetic predisposition that can affect treatment
modality. Thus, in a ligand-based treatment, polymorphism may give rise to amino terminal
extracellular domains and/or other substrate-binding regions that are more or less active in substrate
binding, and drug-metabolizing enzyme activation. Accordingly, substrate dosage would
necessarily be modified to maximize the therapeutic effect within a given population containing a
polymorphism. As an alternative to genotyping, specific polymorphic peptides could be identified.

The peptides are also useful for treating a disorder characterized by an absence of,
inappropriate, or unwanted expression of the protein. Experimental data as provided in Figure 1
indicates expression in humans in the stomach, brain (including infant), endometrial tumors,
prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular
carcinomas. Accordingly, methods for treatment include the use of the drug-metabolizing enzyme
protein or fragments.

Antibodies

The invention also provides antibodies that selectively bind to one of the peptides of the
present invention, a protein comprising such a peptide, as well as variants and fragments thereof.
As used herein, an antibody selectively binds a target peptide when it binds the target peptide and
does not significantly bind to unrelated proteins. An antibody is still considered to selectively bind
a peptide even if it also binds to other proteins that are not substantially homologous with the target
peptide so long as such proteins share homology with a fragment or domain of the peptide target of
the antibody. In this case, it would be understood that antibody binding to the peptide is still
selective despite some degree of cross-reactivity.

As used herein, an antibody is defined in terms consistent with that recognized within the
art: they are multi-subunit proteins produced by a mammalian organism in response to an antigen
challenge. The antibodies of the present invention include polyclonal antibodies and monoclonal
antibodies, as well as fragments of such antibodies, including, but not limited to, Fab or F(ab')2, and
Fv fragments.

Many methods are known for generating and/or identifying antibodies to a given target
peptide. Several such methods are described by Harlow, Antibodies, Cold Spring Harbor Press,
(1989).

In general, to generate antibodies, an isolated peptide is used as an immunogen and is
administered to a mammalian organism, such as a rat, rabbit or mouse. The full-length protein, an
antigenic peptide fragment or a fusion protein can be used. Particularly important fragments are
those covering functional domains, such as the domains identified in Figure 2, and domains of
sequence homology or divergence amongst the family, such as those that can readily be identified
using protein alignment methods and as presented in the Figures.

Antibodies are preferably prepared from regions or discrete fragments of the
drug-metabolizing enzyme proteins. Antibodies can be prepared from any region of the peptide as
described herein. However, preferred regions will include those involved in function/activity
and/or drug-metabolizing enzyme/binding partner interaction. Figure 2 can be used to identify
particularly important regions while sequence alignment can be used to identify conserved and
unique sequence fragments.

An antigenic fragment will typically comprise at least 8 contiguous amino acid residues.
The antigenic peptide can comprise, however, at least 10, 12, 14, 16 or more amino acid residues.
Such fragments can be selected on a physical property, such as fragments that correspond to regions
that are located on the surface of the protein, e.g., hydrophilic regions, or can be selected based on
sequence uniqueness (see Figure 2).
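
One way to shortlist candidate hydrophilic fragments of at least 8 contiguous residues is a
sliding-window scan over the protein sequence. The sketch below is illustrative only: it assumes
the Kyte-Doolittle hydropathy scale as a proxy for surface exposure (the specification does not
name a scale), and the function name, default window size, and example sequence are hypothetical.

```python
# Kyte-Doolittle hydropathy values; lower values indicate more hydrophilic residues.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophilic_windows(sequence, window=8, top=3):
    """Return the `top` most hydrophilic contiguous fragments of length `window`.

    Each result is (start_index, fragment, mean_hydropathy). Only the 20 standard
    one-letter residue codes are supported; other characters raise KeyError.
    """
    sequence = sequence.upper()
    scored = []
    for start in range(len(sequence) - window + 1):
        fragment = sequence[start:start + window]
        mean = sum(KYTE_DOOLITTLE[res] for res in fragment) / window
        scored.append((mean, start, fragment))
    scored.sort()  # most negative (most hydrophilic) mean first
    return [(start, fragment, round(mean, 2)) for mean, start, fragment in scored[:top]]


if __name__ == "__main__":
    # Hypothetical sequence, not taken from SEQ ID NO:2.
    for hit in hydrophilic_windows("MALLVILAAGKDEERKSDNQPLLVVAIF"):
        print(hit)
```

Fragments ranked this way would still need to be checked for sequence uniqueness, which the scan
above does not address.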

Detection of an antibody of the present invention can be facilitated by coupling (i.e.,
physically linking) the antibody to a detectable substance. Examples of detectable substances
include various enzymes, prosthetic groups, fluorescent materials, luminescent materials,
bioluminescent materials, and radioactive materials. Examples of suitable enzymes include
horseradish peroxidase, alkaline phosphatase, β-galactosidase, or acetylcholinesterase; examples of
suitable prosthetic group complexes include streptavidin/biotin and avidin/biotin; examples of
suitable fluorescent materials include umbelliferone, fluorescein, fluorescein isothiocyanate,
rhodamine, dichlorotriazinylamine fluorescein, dansyl chloride or phycoerythrin; an example of a
luminescent material includes luminol; examples of bioluminescent materials include luciferase,
luciferin, and aequorin; and examples of suitable radioactive material include ¹²⁵I, ¹³¹I, ³⁵S or ³H.

Antibody Uses 

The antibodies can be used to isolate one of the proteins of the present invention by standard
techniques, such as affinity chromatography or immunoprecipitation. The antibodies can facilitate
the purification of the natural protein from cells and recombinantly produced protein expressed in
host cells. In addition, such antibodies are useful to detect the presence of one of the proteins of the
present invention in cells or tissues to determine the pattern of expression of the protein among
various tissues in an organism and over the course of normal development. Experimental data as
provided in Figure 1 indicates that drug-metabolizing enzyme proteins of the present invention are
expressed in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney,
adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas, as
indicated by virtual northern blot analysis. PCR-based tissue screening panels also indicate
expression in the brain. Further, such antibodies can be used to detect protein in situ, in vitro, or in
a cell lysate or supernatant in order to evaluate the abundance and pattern of expression. Also, such
antibodies can be used to assess abnormal tissue distribution or abnormal expression during
development or progression of a biological condition. Antibody detection of circulating fragments
of the full length protein can be used to identify turnover.

Further, the antibodies can be used to assess expression in disease states such as in active
stages of the disease or in an individual with a predisposition toward disease related to the protein's
function. When a disorder is caused by an inappropriate tissue distribution, developmental
expression, level of expression of the protein, or expressed/processed form, the antibody can be
prepared against the normal protein. Experimental data as provided in Figure 1 indicates expression
in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal
gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. If a disorder is
characterized by a specific mutation in the protein, antibodies specific for this mutant protein can be
used to assay for the presence of the specific mutant protein.

The antibodies can also be used to assess normal and aberrant subcellular localization of
cells in the various tissues in an organism. Experimental data as provided in Figure 1 indicates
expression in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney,
adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. The
diagnostic uses can be applied, not only in genetic testing, but also in monitoring a treatment
modality. Accordingly, where treatment is ultimately aimed at correcting expression level or the
presence of aberrant sequence and aberrant tissue distribution or developmental expression,
antibodies directed against the protein or relevant fragments can be used to monitor therapeutic
efficacy.

Additionally, antibodies are useful in pharmacogenomic analysis. Thus, antibodies prepared
against polymorphic proteins can be used to identify individuals that require modified treatment
modalities. The antibodies are also useful as diagnostic tools as an immunological marker for
aberrant protein analyzed by electrophoretic mobility, isoelectric point, tryptic peptide digest, and
other physical assays known to those in the art.

The antibodies are also useful for tissue typing. Experimental data as provided in Figure 1
indicates expression in humans in the stomach, brain (including infant), endometrial tumors,
prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular
carcinomas. Thus, where a specific protein has been correlated with expression in a specific tissue,
antibodies that are specific for this protein can be used to identify a tissue type.

The antibodies are also useful for inhibiting protein function, for example, blocking the
binding of the drug-metabolizing enzyme peptide to a binding partner such as a substrate. These
uses can also be applied in a therapeutic context in which treatment involves inhibiting the protein's
function. An antibody can be used, for example, to block binding, thus modulating (agonizing or
antagonizing) the peptide's activity. Antibodies can be prepared against specific fragments
containing sites required for function or against intact protein that is associated with a cell or cell
membrane. See Figure 2 for structural information relating to the proteins of the present invention.

The invention also encompasses kits for using antibodies to detect the presence of a protein
in a biological sample. The kit can comprise antibodies such as a labeled or labelable antibody and
a compound or agent for detecting protein in a biological sample; means for determining the amount
of protein in the sample; means for comparing the amount of protein in the sample with a standard;
and instructions for use. Such a kit can be supplied to detect a single protein or epitope or can be
configured to detect one of a multitude of epitopes, such as in an antibody detection array. Arrays
are described in detail below for nucleic acid arrays and similar methods have been developed for
antibody arrays.

Nucleic Acid Molecules

The present invention further provides isolated nucleic acid molecules that encode a
drug-metabolizing enzyme peptide or protein of the present invention (cDNA, transcript and genomic
sequence). Such nucleic acid molecules will consist of, consist essentially of, or comprise a
nucleotide sequence that encodes one of the drug-metabolizing enzyme peptides of the present
invention, an allelic variant thereof, or an ortholog or paralog thereof.

As used herein, an "isolated" nucleic acid molecule is one that is separated from other
nucleic acid present in the natural source of the nucleic acid. Preferably, an "isolated" nucleic acid
is free of sequences that naturally flank the nucleic acid (i.e., sequences located at the 5' and 3' ends
of the nucleic acid) in the genomic DNA of the organism from which the nucleic acid is derived.
However, there can be some flanking nucleotide sequences, for example up to about 5KB, 4KB,
3KB, 2KB, or 1KB or less, particularly contiguous peptide encoding sequences and peptide
encoding sequences within the same gene but separated by introns in the genomic sequence. The
important point is that the nucleic acid is isolated from remote and unimportant flanking sequences
such that it can be subjected to the specific manipulations described herein such as recombinant
expression, preparation of probes and primers, and other uses specific to the nucleic acid sequences.
Moreover, an "isolated" nucleic acid molecule, such as a transcript/cDNA molecule, can be
substantially free of other cellular material, or culture medium when produced by recombinant
techniques, or chemical precursors or other chemicals when chemically synthesized. However, the
nucleic acid molecule can be fused to other coding or regulatory sequences and still be considered
isolated.

For example, recombinant DNA molecules contained in a vector are considered isolated.
Further examples of isolated DNA molecules include recombinant DNA molecules maintained in
heterologous host cells or purified (partially or substantially) DNA molecules in solution. Isolated
RNA molecules include in vivo or in vitro RNA transcripts of the isolated DNA molecules of the
present invention. Isolated nucleic acid molecules according to the present invention further include
such molecules produced synthetically.

Accordingly, the present invention provides nucleic acid molecules that consist of the
nucleotide sequence shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3,
genomic sequence), or any nucleic acid molecule that encodes the protein provided in Figure 2,
SEQ ID NO:2. A nucleic acid molecule consists of a nucleotide sequence when the nucleotide
sequence is the complete nucleotide sequence of the nucleic acid molecule.

The present invention further provides nucleic acid molecules that consist essentially of the
nucleotide sequence shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3,
genomic sequence), or any nucleic acid molecule that encodes the protein provided in Figure 2,
SEQ ID NO:2. A nucleic acid molecule consists essentially of a nucleotide sequence when such a
nucleotide sequence is present with only a few additional nucleic acid residues in the final nucleic
acid molecule.

The present invention further provides nucleic acid molecules that comprise the nucleotide
sequences shown in Figure 1 or 3 (SEQ ID NO:1, transcript sequence and SEQ ID NO:3, genomic
sequence), or any nucleic acid molecule that encodes the protein provided in Figure 2, SEQ ID
NO:2. A nucleic acid molecule comprises a nucleotide sequence when the nucleotide sequence is at
least part of the final nucleotide sequence of the nucleic acid molecule. In such a fashion, the
nucleic acid molecule can be only the nucleotide sequence or have additional nucleic acid residues,
such as nucleic acid residues that are naturally associated with it or heterologous nucleotide
sequences. Such a nucleic acid molecule can have a few additional nucleotides or can comprise
several hundred or more additional nucleotides. A brief description of how various types of these
nucleic acid molecules can be readily made/isolated is provided below.
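
The three definitions above ("consists of", "consists essentially of", "comprises") can be
illustrated with a simple string comparison. This is a rough sketch for orientation only, not a
statement of claim construction: the function name is hypothetical, and the numeric cutoff for
"only a few additional nucleic acid residues" is an arbitrary assumption, since the text gives
no number.

```python
def narrowest_relationship(molecule, reference, few=10):
    """Return the narrowest of the three relationships that `molecule` satisfies.

    `molecule` and `reference` are nucleotide strings. `few` is an assumed,
    illustrative cutoff for "only a few additional residues".
    """
    molecule = molecule.upper()
    reference = reference.upper()
    if reference not in molecule:
        return "none"
    extra = len(molecule) - len(reference)
    if extra == 0:
        # The reference is the complete sequence of the molecule.
        return "consists of"
    if extra <= few:
        # The reference is present with only a few additional residues.
        return "consists essentially of"
    # The reference is at least part of a longer molecule.
    return "comprises"


if __name__ == "__main__":
    ref = "ATGGCCTTAA"  # hypothetical reference, not SEQ ID NO:1
    print(narrowest_relationship(ref, ref))
    print(narrowest_relationship("GG" + ref + "C", ref))
    print(narrowest_relationship("A" * 300 + ref, ref))
```

Note that any molecule returning "consists of" or "consists essentially of" here also comprises
the reference sequence; the function reports only the narrowest category.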

In Figures 1 and 3, both coding and non-coding sequences are provided. Because of the
source of the present invention, human genomic sequence (Figure 3) and cDNA/transcript
sequences (Figure 1), the nucleic acid molecules in the Figures will contain genomic intronic
sequences, 5' and 3' non-coding sequences, gene regulatory regions and non-coding intergenic
sequences. In general such sequence features are either noted in Figures 1 and 3 or can readily
be identified using computational tools known in the art. As discussed below, some of the
non-coding regions, particularly gene regulatory elements such as promoters, are useful for a variety
of purposes, e.g. control of heterologous gene expression, target for identifying gene activity
modulating compounds, and are particularly claimed as fragments of the genomic sequence
provided herein.

The isolated nucleic acid molecules can encode the mature protein plus additional amino or
carboxyl-terminal amino acids, or amino acids interior to the mature peptide (when the mature form
has more than one peptide chain, for instance). Such sequences may play a role in processing of a
protein from precursor to a mature form, facilitate protein trafficking, prolong or shorten protein
half-life or facilitate manipulation of a protein for assay or production, among other things. As
generally is the case in situ, the additional amino acids may be processed away from the mature
protein by cellular enzymes.

As mentioned above, the isolated nucleic acid molecules include, but are not limited to, the
sequence encoding the drug-metabolizing enzyme peptide alone, the sequence encoding the mature
peptide and additional coding sequences, such as a leader or secretory sequence (e.g., a pre-pro or
pro-protein sequence), the sequence encoding the mature peptide, with or without the additional
coding sequences, plus additional non-coding sequences, for example introns and non-coding 5' and
3' sequences such as transcribed but non-translated sequences that play a role in transcription,
mRNA processing (including splicing and polyadenylation signals), ribosome binding and stability
of mRNA. In addition, the nucleic acid molecule may be fused to a marker sequence encoding, for
example, a peptide that facilitates purification.

Isolated nucleic acid molecules can be in the form of RNA, such as mRNA, or in the form of
DNA, including cDNA and genomic DNA obtained by cloning or produced by chemical synthetic
techniques or by a combination thereof. The nucleic acid, especially DNA, can be double-stranded
or single-stranded. Single-stranded nucleic acid can be the coding strand (sense strand) or the
non-coding strand (anti-sense strand).

The invention further provides nucleic acid molecules that encode fragments of the peptides
of the present invention as well as nucleic acid molecules that encode obvious variants of the
drug-metabolizing enzyme proteins of the present invention that are described above. Such nucleic
acid molecules may be naturally occurring, such as allelic variants (same locus), paralogs
(different locus), and orthologs (different organism), or may be constructed by recombinant DNA
methods or by chemical synthesis. Such non-naturally occurring variants may be made by mutagenesis
techniques, including those applied to nucleic acid molecules, cells, or organisms. Accordingly, as
discussed above, the variants can contain nucleotide substitutions, deletions, inversions and
insertions. Variation can occur in either or both the coding and non-coding regions. The variations
can produce both conservative and non-conservative amino acid substitutions.

The present invention further provides non-coding fragments of the nucleic acid molecules
provided in Figures 1 and 3. Preferred non-coding fragments include, but are not limited to,
promoter sequences, enhancer sequences, gene modulating sequences and gene termination
sequences. Such fragments are useful in controlling heterologous gene expression and in
developing screens to identify gene-modulating agents. A promoter can readily be identified as
being 5' to the ATG start site in the genomic sequence provided in Figure 3.

A fragment comprises a contiguous nucleotide sequence greater than 12 or more
nucleotides. Further, a fragment could be at least 30, 40, 50, 100, 250 or 500 nucleotides in length.
The length of the fragment will be based on its intended use. For example, the fragment can encode
epitope bearing regions of the peptide, or can be useful as DNA probes and primers. Such
fragments can be isolated using the known nucleotide sequence to synthesize an oligonucleotide
probe. A labeled probe can then be used to screen a cDNA library, genomic DNA library, or
mRNA to isolate nucleic acid corresponding to the coding region. Further, primers can be used in
PCR reactions to clone specific regions of a gene.
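
By way of illustration only, the enumeration of candidate probe fragments of a chosen length can
be sketched as follows. The function name and the 20-nucleotide example sequence are hypothetical
and are not taken from the Figures; the length of 13 is simply one value greater than 12.

```python
def contiguous_fragments(sequence, length):
    """Return every contiguous fragment of the given length, in 5' to 3' order."""
    if length <= 0:
        raise ValueError("length must be positive")
    # A sequence of n nucleotides contains n - length + 1 windows of that length.
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


# Hypothetical 20-nt sequence used only for illustration (not SEQ ID NO:1 or 3).
example = "ATGGCCTTAGCTAGGCTAAC"
probes = contiguous_fragments(example, 13)
```

Any such fragment would still need to be checked for its intended use, for instance for
specificity as a probe or primer.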

A probe/primer typically comprises a substantially purified oligonucleotide or
oligonucleotide pair. The oligonucleotide typically comprises a region of nucleotide sequence that
hybridizes under stringent conditions to at least about 12, 20, 25, 40, 50 or more consecutive
nucleotides.

Orthologs, homologs, and allelic variants can be identified using methods well known in the
art. As described in the Peptide Section, these variants comprise a nucleotide sequence encoding a
peptide that is typically 60-70%, 70-80%, 80-90%, and more typically at least about 90-95% or
more homologous to the nucleotide sequence shown in the Figure sheets or a fragment of this
sequence. Such nucleic acid molecules can readily be identified as being able to hybridize under
moderate to stringent conditions, to the nucleotide sequence shown in the Figure sheets or a
fragment of the sequence. Allelic variants can readily be determined by genetic locus of the
encoding gene. The gene encoding the novel drug-metabolizing protein of the present invention is
located on a genome component that has been mapped to human chromosome 1 (as indicated in
Figure 3), which is supported by multiple lines of evidence, such as STS and BAC map data.
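
As a rough illustration of how a percentage such as 90-95% might be computed, the sketch below
scores percent identity over two sequences that are assumed to be already aligned to equal length
(gaps written as "-"). Identity is used here only as a simple stand-in for the homology referred to
above; the function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    # A gap aligned to a gap is not counted as a match.
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


# Hypothetical 10-column alignment differing at the last position.
identity = percent_identity("ATGCATGCAT", "ATGCATGCAA")
```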

Figure 3 provides SNP information that has been found in the gene encoding the
drug-metabolizing proteins of the present invention. SNPs, including insertion/deletion variants
("indels"), were identified at 45 different nucleotide positions. Changes in the amino acid sequence
caused by these SNPs can readily be determined using the universal genetic code and the protein
sequence provided in Figure 2 as a reference. Positioning of each SNP in exons, introns, or outside
the ORF can readily be determined using the DNA positions given for each SNP and the start/stop,
exon, and intron coordinates given in the features.
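
The positioning step described above can be sketched as a simple coordinate lookup. The following
is a minimal illustration that assumes 1-based, inclusive genomic coordinates; the function name
and all coordinates shown are hypothetical and are not the feature coordinates of Figure 3.

```python
def classify_snp(position, exons, orf_start, orf_end):
    """Classify a genomic SNP position as 'exon', 'intron' or 'outside ORF'.

    exons is a list of (start, end) pairs; orf_start and orf_end are the genomic
    positions of the start and stop of the ORF. All values are 1-based and inclusive.
    """
    if position < orf_start or position > orf_end:
        return "outside ORF"
    for start, end in exons:
        if start <= position <= end:
            return "exon"
    # Inside the start/stop span but not in any exon.
    return "intron"


# Hypothetical two-exon gene.
exons = [(100, 200), (300, 400)]
calls = [classify_snp(p, exons, 100, 400) for p in (50, 150, 250)]
```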

As used herein, the term "hybridizes under stringent conditions" is intended to describe
conditions for hybridization and washing under which nucleotide sequences encoding a peptide at
least 60-70% homologous to each other typically remain hybridized to each other. The conditions
can be such that sequences at least about 60%, at least about 70%, or at least about 80% or more
homologous to each other typically remain hybridized to each other. Such stringent conditions are
known to those skilled in the art and can be found in Current Protocols in Molecular Biology, John
Wiley & Sons, N.Y. (1989), 6.3.1-6.3.6. One example of stringent hybridization conditions is
hybridization in 6X sodium chloride/sodium citrate (SSC) at about 45°C, followed by one or more
washes in 0.2X SSC, 0.1% SDS at 50-65°C. Examples of moderate to low stringency hybridization
conditions are well known in the art.

Nucleic Acid Molecule Uses

The nucleic acid molecules of the present invention are useful for probes, primers, chemical
intermediates, and in biological assays. The nucleic acid molecules are useful as a hybridization
probe for messenger RNA, transcript/cDNA and genomic DNA to isolate full-length cDNA and
genomic clones encoding the peptide described in Figure 2 and to isolate cDNA and genomic
clones that correspond to variants (alleles, orthologs, etc.) producing the same or related peptides
shown in Figure 2. As illustrated in Figure 3, SNPs were identified at 45 different nucleotide
positions.

The probe can correspond to any sequence along the entire length of the nucleic acid
molecules provided in the Figures. Accordingly, it could be derived from 5' noncoding regions, the
coding region, and 3' noncoding regions. However, as discussed, fragments are not to be construed
as encompassing fragments disclosed prior to the present invention.

The nucleic acid molecules are also useful as primers for PCR to amplify any given region
of a nucleic acid molecule and are useful to synthesize antisense molecules of desired length and
sequence.

The nucleic acid molecules are also useful for constructing recombinant vectors. Such
vectors include expression vectors that express a portion of, or all of, the peptide sequences.
Vectors also include insertion vectors, used to integrate into another nucleic acid molecule
sequence, such as into the cellular genome, to alter in situ expression of a gene and/or gene product.
For example, an endogenous coding sequence can be replaced via homologous recombination with
all or part of the coding region containing one or more specifically introduced mutations.

The nucleic acid molecules are also useful for expressing antigenic portions of the proteins. 

The nucleic acid molecules are also useful as probes for determining the chromosomal
positions of the nucleic acid molecules by means of in situ hybridization methods. The gene
encoding the novel drug-metabolizing protein of the present invention is located on a genome
component that has been mapped to human chromosome 1 (as indicated in Figure 3), which is
supported by multiple lines of evidence, such as STS and BAC map data.

The nucleic acid molecules are also useful in making vectors containing the gene regulatory
regions of the nucleic acid molecules of the present invention.

The nucleic acid molecules are also useful for designing ribozymes corresponding to all, or
a part, of the mRNA produced from the nucleic acid molecules described herein.

The nucleic acid molecules are also useful for making vectors that express part, or all, of the 
peptides. 

The nucleic acid molecules are also useful for constructing host cells expressing a part, or
all, of the nucleic acid molecules and peptides.

The nucleic acid molecules are also useful for constructing transgenic animals expressing
all, or a part, of the nucleic acid molecules and peptides.

The nucleic acid molecules are also useful as hybridization probes for determining the
presence, level, form and distribution of nucleic acid expression. Experimental data as provided in
Figure 1 indicates that drug-metabolizing enzyme proteins of the present invention are expressed in
humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland
tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas, as indicated by virtual
northern blot analysis. PCR-based tissue screening panels also indicate expression in the brain.
Accordingly, the probes can be used to detect the presence of, or to determine levels of, a specific
nucleic acid molecule in cells, and in organisms. The nucleic acid whose level is
determined can be DNA or RNA. Accordingly, probes corresponding to the peptides described
herein can be used to assess expression and/or gene copy number in a given cell, tissue, or
organism. These uses are relevant for diagnosis of disorders involving an increase or decrease in
drug-metabolizing enzyme protein expression relative to normal results.

In vitro techniques for detection of mRNA include Northern hybridizations and in situ
hybridizations. In vitro techniques for detecting DNA include Southern hybridizations and in situ
hybridization.

Probes can be used as a part of a diagnostic test kit for identifying cells or tissues that
express a drug-metabolizing enzyme protein, such as by measuring a level of a drug-metabolizing
enzyme-encoding nucleic acid in a sample of cells from a subject, e.g., mRNA or genomic DNA, or
determining if a drug-metabolizing enzyme gene has been mutated. Experimental data as provided
in Figure 1 indicates that drug-metabolizing enzyme proteins of the present invention are expressed
in humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal
gland tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas, as indicated by
virtual northern blot analysis. PCR-based tissue screening panels also indicate expression in the
brain.

Nucleic acid expression assays are useful for drug screening to identify compounds that
modulate drug-metabolizing enzyme nucleic acid expression.

The invention thus provides a method for identifying a compound that can be used to treat a
disorder associated with nucleic acid expression of the drug-metabolizing enzyme gene, particularly
biological and pathological processes that are mediated by the drug-metabolizing enzyme in cells
and tissues that express it. Experimental data as provided in Figure 1 indicates expression in humans
in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland tumors,
head/neck, sympathetic trunk, breast, and hepatocellular carcinomas. The method typically includes
assaying the ability of the compound to modulate the expression of the drug-metabolizing enzyme
nucleic acid and thus identifying a compound that can be used to treat a disorder characterized by
undesired drug-metabolizing enzyme nucleic acid expression. The assays can be performed in
cell-based and cell-free systems. Cell-based assays include cells naturally expressing the
drug-metabolizing enzyme nucleic acid or recombinant cells genetically engineered to express
specific nucleic acid sequences.

Thus, modulators of drug-metabolizing enzyme gene expression can be identified in a
method wherein a cell is contacted with a candidate compound and the expression of mRNA
determined. The level of expression of drug-metabolizing enzyme mRNA in the presence of the
candidate compound is compared to the level of expression of drug-metabolizing enzyme mRNA in
the absence of the candidate compound. The candidate compound can then be identified as a
modulator of nucleic acid expression based on this comparison and be used, for example, to treat a
disorder characterized by aberrant nucleic acid expression. When expression of mRNA is
statistically significantly greater in the presence of the candidate compound than in its absence, the
candidate compound is identified as a stimulator of nucleic acid expression. When nucleic acid
expression is statistically significantly less in the presence of the candidate compound than in its
absence, the candidate compound is identified as an inhibitor of nucleic acid expression.
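
The comparison described above can be sketched, purely by way of example, with an exact two-sided
permutation test on replicate mRNA measurements. The choice of test, the 0.05 threshold, the
function name and the measurement values are all assumptions made for illustration; the
specification does not prescribe a particular statistical method.

```python
from itertools import combinations
from statistics import mean


def classify_modulator(with_compound, without_compound, alpha=0.05):
    """Label a candidate as stimulator, inhibitor, or neither, and return the p-value."""
    pooled = list(with_compound) + list(without_compound)
    n = len(with_compound)
    observed = mean(with_compound) - mean(without_compound)
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling the pooled measurements.
    for chosen in combinations(indices, n):
        chosen_set = set(chosen)
        group_a = [pooled[i] for i in chosen]
        group_b = [pooled[i] for i in indices if i not in chosen_set]
        diff = mean(group_a) - mean(group_b)
        total += 1
        # Small tolerance guards against floating-point rounding.
        if abs(diff) >= abs(observed) - 1e-12:
            extreme += 1
    p_value = extreme / total
    if p_value >= alpha:
        return "no significant effect", p_value
    return ("stimulator" if observed > 0 else "inhibitor"), p_value


# Hypothetical mRNA levels (arbitrary units), four replicates per condition.
label, p = classify_modulator([10, 11, 12, 13], [1, 2, 3, 4])
```

Exhaustive enumeration is practical only for small numbers of replicates; larger experiments
would call for a different test.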

The invention further provides methods of treatment, with the nucleic acid as a target, using
a compound identified through drug screening as a gene modulator to modulate drug-metabolizing
enzyme nucleic acid expression in cells and tissues that express the drug-metabolizing enzyme.
Experimental data as provided in Figure 1 indicates that drug-metabolizing enzyme proteins of the
present invention are expressed in humans in the stomach, brain (including infant), endometrial
tumors, prostate, kidney, adrenal gland tumors, head/neck, sympathetic trunk, breast, and
hepatocellular carcinomas, as indicated by virtual northern blot analysis. PCR-based tissue
screening panels also indicate expression in the brain. Modulation includes both up-regulation (i.e.
activation or agonization) and down-regulation (suppression or antagonization) of nucleic acid
expression.

Alternatively, a modulator for drug-metabolizing enzyme nucleic acid expression can be a
small molecule or drug identified using the screening assays described herein as long as the drug or
small molecule inhibits the drug-metabolizing enzyme nucleic acid expression in the cells and
tissues that express the protein. Experimental data as provided in Figure 1 indicates expression in
humans in the stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland
tumors, head/neck, sympathetic trunk, breast, and hepatocellular carcinomas.

The nucleic acid molecules are also useful for monitoring the effectiveness of modulating
compounds on the expression or activity of the drug-metabolizing enzyme gene in clinical trials or
in a treatment regimen. Thus, the gene expression pattern can serve as a barometer for the
continuing effectiveness of treatment with the compound, particularly with compounds to which a
patient can develop resistance. The gene expression pattern can also serve as a marker indicative of
a physiological response of the affected cells to the compound. Accordingly, such monitoring
would allow either increased administration of the compound or the administration of alternative
compounds to which the patient has not become resistant. Similarly, if the level of nucleic acid
expression falls below a desirable level, administration of the compound could be commensurately
decreased.

The nucleic acid molecules are also useful in diagnostic assays for qualitative changes in
drug-metabolizing enzyme nucleic acid expression, and particularly in qualitative changes that lead
to pathology. The nucleic acid molecules can be used to detect mutations in drug-metabolizing
enzyme genes and gene expression products such as mRNA. The nucleic acid molecules can be
used as hybridization probes to detect naturally occurring genetic mutations in the
drug-metabolizing enzyme gene and thereby to determine whether a subject with the mutation is at
risk for a disorder caused by the mutation. Mutations include deletion, addition, or substitution of
one or more nucleotides in the gene, chromosomal rearrangement, such as inversion or transposition,
modification of genomic DNA, such as aberrant methylation patterns or changes in gene copy
number, such as amplification. Detection of a mutated form of the drug-metabolizing enzyme gene
associated with a dysfunction provides a diagnostic tool for an active disease or susceptibility to
disease when the disease results from overexpression, underexpression, or altered expression of a
drug-metabolizing enzyme protein.

Individuals carrying mutations in the drug-metabolizing enzyme gene can be detected at the
nucleic acid level by a variety of techniques. Figure 3 provides SNP information that has been
found in the gene encoding the drug-metabolizing proteins of the present invention. SNPs, including
insertion/deletion variants ("indels"), were identified at 45 different nucleotide positions. Changes
in the amino acid sequence caused by these SNPs can readily be determined using the universal
genetic code and the protein sequence provided in Figure 2 as a reference. Positioning of each SNP
in exons, introns, or outside the ORF can readily be determined using the DNA positions given for
each SNP and the start/stop, exon, and intron coordinates given in the features. The gene encoding
the novel drug-metabolizing protein of the present invention is located on a genome component that
has been mapped to human chromosome 1 (as indicated in Figure 3), which is supported by
multiple lines of evidence, such as STS and BAC map data. Genomic DNA can be analyzed
directly or can be amplified by using PCR prior to analysis. RNA or cDNA can be used in the same
way. In some uses, detection of the mutation involves the use of a probe/primer in a polymerase
chain reaction (PCR) (see, e.g., U.S. Patent Nos. 4,683,195 and 4,683,202), such as anchor PCR or
RACE PCR, or, alternatively, in a ligation chain reaction (LCR) (see, e.g., Landegran et al., Science
241:1077-1080 (1988); and Nakazawa et al., PNAS 91:360-364 (1994)), the latter of which can be
particularly useful for detecting point mutations in the gene (see Abravaya et al., Nucleic Acids Res.
23:675-682 (1995)). This method can include the steps of collecting a sample of cells from a
patient, isolating nucleic acid (e.g., genomic, mRNA or both) from the cells of the sample,
contacting the nucleic acid sample with one or more primers which specifically hybridize to a gene
under conditions such that hybridization and amplification of the gene (if present) occurs, and
detecting the presence or absence of an amplification product, or detecting the size of the
amplification product and comparing the length to a control sample. Deletions and insertions can be
detected by a change in size of the amplified product compared to the normal genotype. Point
mutations can be identified by hybridizing amplified DNA to normal RNA or antisense DNA
sequences.
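
The size comparison mentioned above can be illustrated with a short sketch. The function name,
the tolerance parameter (standing in for sizing error on a gel) and the lengths used are
hypothetical; a point mutation leaves the product length unchanged and would not be detected this
way.

```python
def classify_amplicon(sample_length, control_length, tolerance=0):
    """Compare an amplification product length (bp) with that of a control sample."""
    difference = sample_length - control_length
    if abs(difference) <= tolerance:
        return "no size change"
    return "insertion" if difference > 0 else "deletion"


# Hypothetical product lengths in base pairs.
results = [classify_amplicon(n, 500) for n in (500, 512, 470)]
```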

Alternatively, mutations in a drug-metabolizing enzyme gene can be directly identified, for
example, by alterations in restriction enzyme digestion patterns determined by gel electrophoresis.
Further, sequence-specific ribozymes (U.S. Patent No. 5,498,531) can be used to score for
the presence of specific mutations by development or loss of a ribozyme cleavage site. Perfectly
matched sequences can be distinguished from mismatched sequences by nuclease cleavage
digestion assays or by differences in melting temperature.

Sequence changes at specific locations can also be assessed by nuclease protection assays
such as RNase and S1 protection or the chemical cleavage method. Furthermore, sequence
differences between a mutant drug-metabolizing enzyme gene and a wild-type gene can be
determined by direct DNA sequencing. A variety of automated sequencing procedures can be
utilized when performing the diagnostic assays (Naeve, C.W., (1995) Biotechniques 19:448),
including sequencing by mass spectrometry (see, e.g., PCT International Publication No. WO
94/16101; Cohen et al., Adv. Chromatogr. 36:127-162 (1996); and Griffin et al., Appl. Biochem.
Biotechnol. 38:147-159 (1993)).

Other methods for detecting mutations in the gene include methods in which protection
from cleavage agents is used to detect mismatched bases in RNA/RNA or RNA/DNA duplexes
(Myers et al., Science 230:1242 (1985); Cotton et al., PNAS 85:4397 (1988); Saleeba et al., Meth.
Enzymol. 217:286-295 (1992)), electrophoretic mobility of mutant and wild type nucleic acid is
compared (Orita et al., PNAS 86:2766 (1989); Cotton et al., Mutat. Res. 285:125-144 (1993); and
Hayashi et al., Genet. Anal. Tech. Appl. 9:73-79 (1992)), and movement of mutant or wild-type
fragments in polyacrylamide gels containing a gradient of denaturant is assayed using denaturing
gradient gel electrophoresis (Myers et al., Nature 313:495 (1985)). Examples of other techniques
for detecting point mutations include selective oligonucleotide hybridization, selective
amplification, and selective primer extension.

The nucleic acid molecules are also useful for testing an individual for a genotype that, while
not necessarily causing the disease, nevertheless affects the treatment modality. Thus, the nucleic
acid molecules can be used to study the relationship between an individual's genotype and the
individual's response to a compound used for treatment (pharmacogenomic relationship).
Accordingly, the nucleic acid molecules described herein can be used to assess the mutation content
of the drug-metabolizing enzyme gene in an individual in order to select an appropriate compound
or dosage regimen for treatment. Figure 3 provides SNP information that has been found in the
gene encoding the drug-metabolizing proteins of the present invention. SNPs, including
insertion/deletion variants ("indels"), were identified at 45 different nucleotide positions. Changes
in the amino acid sequence caused by these SNPs can readily be determined using the universal
genetic code and the protein sequence provided in Figure 2 as a reference. Positioning of each SNP
in exons, introns, or outside the ORF can readily be determined using the DNA positions given for
each SNP and the start/stop, exon, and intron coordinates given in the features.
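
As an illustration of determining an amino acid change from the universal genetic code, the sketch
below substitutes one base in a coding sequence and translates the affected codon before and after.
The function name, the 1-based coding-sequence position convention and the nine-nucleotide example
are hypothetical; the example is not the coding sequence of SEQ ID NO:1.

```python
# Standard genetic code, listed in TCAG order for the first, second and third base.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def snp_effect(coding_seq, cds_position, alt_base):
    """Return (reference amino acid, variant amino acid, kind) for a substitution.

    cds_position is 1-based within the coding sequence; '*' denotes a stop codon.
    """
    index = cds_position - 1
    codon_start = index - index % 3
    ref_codon = coding_seq[codon_start:codon_start + 3]
    offset = index - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    ref_aa = CODON_TABLE[ref_codon]
    alt_aa = CODON_TABLE[alt_codon]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return ref_aa, alt_aa, kind


# Hypothetical coding sequence ATG GCC GAA (Met-Ala-Glu).
effect = snp_effect("ATGGCCGAA", 5, "T")
```

Insertion/deletion variants shift the reading frame and are not handled by this single-base sketch.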

Thus nucleic acid molecules displaying genetic variations that affect treatment provide a
diagnostic target that can be used to tailor treatment in an individual. Accordingly, the production
of recombinant cells and animals containing these polymorphisms allows effective clinical design of
treatment compounds and dosage regimens.

The nucleic acid molecules are thus useful as antisense constructs to control
drug-metabolizing enzyme gene expression in cells, tissues, and organisms. A DNA antisense nucleic
acid molecule is designed to be complementary to a region of the gene involved in transcription,
preventing transcription and hence production of drug-metabolizing enzyme protein. An antisense
RNA or DNA nucleic acid molecule would hybridize to the mRNA and thus block translation of
mRNA into drug-metabolizing enzyme protein.
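
For illustration, an antisense RNA complementary to a chosen mRNA region is the reverse complement
of that region, as sketched below. The function name and the six-nucleotide example region are
hypothetical.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def antisense_rna(mrna_region):
    """Return the antisense RNA (written 5' to 3') for an mRNA region written 5' to 3'."""
    # Complement each base, then reverse so the result also reads 5' to 3'.
    return mrna_region.translate(RNA_COMPLEMENT)[::-1]


# Hypothetical mRNA region.
antisense = antisense_rna("AUGGCC")
```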

Alternatively, a class of antisense molecules can be used to inactivate mRNA in order to
decrease expression of drug-metabolizing enzyme nucleic acid. Accordingly, these molecules can
treat a disorder characterized by abnormal or undesired drug-metabolizing enzyme nucleic acid
expression. This technique involves cleavage by means of ribozymes containing nucleotide
sequences complementary to one or more regions in the mRNA that attenuate the ability of the
mRNA to be translated. Possible regions include coding regions and particularly coding regions
corresponding to the catalytic and other functional activities of the drug-metabolizing enzyme
protein, such as substrate binding.

The nucleic acid molecules also provide vectors for gene therapy in patients containing cells
that are aberrant in drug-metabolizing enzyme gene expression. Thus, recombinant cells, which
include the patient's cells that have been engineered ex vivo and returned to the patient, are
introduced into an individual where the cells produce the desired drug-metabolizing enzyme protein
to treat the individual.

The invention also encompasses kits for detecting the presence of a drug-metabolizing
enzyme nucleic acid in a biological sample. Experimental data as provided in Figure 1 indicates
that drug-metabolizing enzyme proteins of the present invention are expressed in humans in the
stomach, brain (including infant), endometrial tumors, prostate, kidney, adrenal gland tumors,
head/neck, sympathetic trunk, breast, and hepatocellular carcinomas, as indicated by virtual
northern blot analysis. PCR-based tissue screening panels also indicate expression in the brain. For
example, the kit can comprise reagents such as a labeled or labelable nucleic acid or agent capable
of detecting drug-metabolizing enzyme nucleic acid in a biological sample; means for determining
the amount of drug-metabolizing enzyme nucleic acid in the sample; and means for comparing the
amount of drug-metabolizing enzyme nucleic acid in the sample with a standard. The compound or
agent can be packaged in a suitable container. The kit can further comprise instructions for using
the kit to detect drug-metabolizing enzyme protein mRNA or DNA.

Nucleic Acid Arrays 

The present invention further provides nucleic acid detection kits, such as arrays or
microarrays of nucleic acid molecules that are based on the sequence information provided in
Figures 1 and 3 (SEQ ID NOS:1 and 3).

As used herein, "Arrays" or "Microarrays" refers to an array of distinct polynucleotides or
oligonucleotides synthesized on a substrate, such as paper, nylon or other type of membrane,
filter, chip, glass slide, or any other suitable solid support. In one embodiment, the microarray is
prepared and used according to the methods described in US Patent 5,837,832, Chee et al., PCT
application WO95/11995 (Chee et al.), Lockhart, D. J. et al. (1996; Nat. Biotech. 14:1675-1680)
and Schena, M. et al. (1996; Proc. Natl. Acad. Sci. 93:10614-10619), all of which are
incorporated herein in their entirety by reference. In other embodiments, such arrays are
produced by the methods described by Brown et al., US Patent No. 5,807,522.

The microarray or detection kit is preferably composed of a large number of unique,
single-stranded nucleic acid sequences, usually either synthetic antisense oligonucleotides or
fragments of cDNAs, fixed to a solid support. The oligonucleotides are preferably about 6-60
nucleotides in length, more preferably 15-30 nucleotides in length, and most preferably about
20-25 nucleotides in length. For a certain type of microarray or detection kit, it may be preferable
to use oligonucleotides that are only 7-20 nucleotides in length. The microarray or detection kit
may contain oligonucleotides that cover the known 5' or 3' sequence; sequential
oligonucleotides that cover the full length sequence; or unique oligonucleotides selected from
particular areas along the length of the sequence. Polynucleotides used in the microarray or
detection kit may be oligonucleotides that are specific to a gene or genes of interest.

In order to produce oligonucleotides to a known sequence for a microarray or detection
kit, the gene(s) of interest (or an ORF identified from the contigs of the present invention) is
typically examined using a computer algorithm which starts at the 5' or at the 3' end of the
nucleotide sequence. Typical algorithms will then identify oligomers of defined length that are
unique to the gene, have a GC content within a range suitable for hybridization, and lack
predicted secondary structure that may interfere with hybridization. In certain situations it may
be appropriate to use pairs of oligonucleotides on a microarray or detection kit. The "pairs" will
be identical, except for one nucleotide that preferably is located in the center of the sequence.
The second oligonucleotide in the pair (mismatched by one) serves as a control. The number of
oligonucleotide pairs may range from two to one million. The oligomers are synthesized at
designated areas on a substrate using a light-directed chemical process. The substrate may be
paper, nylon or other type of membrane, filter, chip, glass slide or any other suitable solid
support.
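
A minimal sketch of such an algorithm is given below, under stated assumptions: it scans from the
5' end, keeps oligomers of a defined length whose GC fraction lies in a chosen range and which
occur only once in the gene and not in any supplied background sequence, and builds a
one-mismatch control by changing the central nucleotide. The function names, the 40-60% GC range,
the choice of substituted base and the example sequences are hypothetical, and secondary structure
prediction is not modeled.

```python
def gc_fraction(oligo):
    """Fraction of G and C nucleotides in the oligomer."""
    return (oligo.count("G") + oligo.count("C")) / len(oligo)


def candidate_oligos(gene, length, gc_min=0.4, gc_max=0.6, background=()):
    """Return (start, oligo) pairs scanned 5' to 3' that pass the GC and uniqueness filters."""
    results = []
    for start in range(len(gene) - length + 1):
        oligo = gene[start:start + length]
        if not gc_min <= gc_fraction(oligo) <= gc_max:
            continue
        # Unique within the gene: first occurrence is here and there is no later one.
        if gene.find(oligo) != start or gene.find(oligo, start + 1) != -1:
            continue
        # Unique to the gene: absent from every background sequence supplied.
        if any(oligo in other for other in background):
            continue
        results.append((start, oligo))
    return results


def mismatch_control(oligo):
    """Return the paired control oligomer, differing only at the central nucleotide."""
    swap = {"A": "T", "T": "A", "G": "C", "C": "G"}
    center = len(oligo) // 2
    return oligo[:center] + swap[oligo[center]] + oligo[center + 1:]


# Hypothetical 12-nt "gene" and 4-nt oligomers, chosen only to keep the example small.
picked = candidate_oligos("AAAAGCGCAAAA", 4, gc_min=0.5, gc_max=0.5)
control = mismatch_control("ATGCA")
```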

In another aspect, an oligonucleotide may be synthesized on the surface of the substrate
by using a chemical coupling procedure and an ink jet application apparatus, as described in PCT
application WO95/251116 (Baldeschweiler et al.), which is incorporated herein in its entirety by
reference. In another aspect, a "gridded" array analogous to a dot (or slot) blot may be used to
arrange and link cDNA fragments or oligonucleotides to the surface of a substrate using a
vacuum system, thermal, UV, mechanical or chemical bonding procedures. An array, such as
those described above, may be produced by hand or by using available devices (slot blot or dot
blot apparatus), materials (any suitable solid support), and machines (including robotic
instruments), and may contain 8, 24, 96, 384, 1536, 6144 or more oligonucleotides, or any other
number between two and one million which lends itself to the efficient use of commercially
available instrumentation.

In order to conduct sample analysis using a microarray or detection kit, the RNA or DNA
from a biological sample is made into hybridization probes. The mRNA is isolated, and cDNA is
produced and used as a template to make antisense RNA (aRNA). The aRNA is amplified in the
presence of fluorescent nucleotides, and labeled probes are incubated with the microarray or
detection kit so that the probe sequences hybridize to complementary oligonucleotides of the
microarray or detection kit. Incubation conditions are adjusted so that hybridization occurs with
precise complementary matches or with various degrees of less complementarity. After removal
of nonhybridized probes, a scanner is used to determine the levels and patterns of fluorescence.
The scanned images are examined to determine degree of complementarity and the relative
abundance of each oligonucleotide sequence on the microarray or detection kit. The biological
samples may be obtained from any bodily fluids (such as blood, urine, saliva, phlegm, gastric
juices, etc.), cultured cells, biopsies, or other tissue preparations. A detection system may be
used to measure the absence, presence, and amount of hybridization for all of the distinct
sequences simultaneously. This data may be used for large-scale correlation studies on the
sequences, expression patterns, mutations, variants, or polymorphisms among samples.

Using such arrays, the present invention provides methods to identify the expression of
the drug-metabolizing enzyme proteins/peptides of the present invention. In detail, such
methods comprise incubating a test sample with one or more nucleic acid molecules and
assaying for binding of the nucleic acid molecule with components within the test sample. Such
assays will typically involve arrays comprising many genes, at least one of which is a gene of the
present invention and/or alleles of the drug-metabolizing enzyme gene of the present invention.
Figure 3 provides SNP information that has been found in the gene encoding the
drug-metabolizing proteins of the present invention. SNPs, including insertion/deletion variants
("indels"), were identified at 45 different nucleotide positions. Changes in the amino acid
sequence caused by these SNPs can readily be determined using the universal genetic code and
the protein sequence provided in Figure 2 as a reference. Positioning of each SNP in exons,
introns, or outside the ORF can readily be determined using the DNA positions given for each
SNP and the start/stop, exon, and intron coordinates given in the features.
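Positioning a SNP relative to the annotated features amounts to an interval lookup. The sketch below
assumes 1-based, inclusive coordinates and treats every position between the start and stop
coordinates that is not inside an exon as intronic; the function name is hypothetical, and the
coordinates in the usage note are made up for illustration and are not taken from Figure 3.

```python
from bisect import bisect_right


def classify_position(pos, exons, orf_start, orf_end):
    """Classify a 1-based position as 'outside ORF', 'exon', or 'intron'.

    `exons` is a list of (start, end) pairs, inclusive, sorted by start and
    non-overlapping. `orf_start` and `orf_end` are the start/stop coordinates.
    """
    if pos < orf_start or pos > orf_end:
        return "outside ORF"
    starts = [start for start, _ in exons]
    # Index of the last exon whose start is <= pos, or -1 if there is none.
    i = bisect_right(starts, pos) - 1
    if i >= 0 and exons[i][0] <= pos <= exons[i][1]:
        return "exon"
    return "intron"
```

With hypothetical exons `[(100, 250), (400, 520), (900, 1000)]` and start/stop coordinates 100 and
1000, position 300 is classified as intronic and position 50 as outside the ORF.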
Conditions for incubating a nucleic acid molecule with a test sample vary. Incubation
conditions depend on the format employed in the assay, the detection methods employed, and the
type and nature of the nucleic acid molecule used in the assay. One skilled in the art will
recognize that any one of the commonly available hybridization, amplification or array assay
formats can readily be adapted to employ the novel fragments of the Human genome disclosed
herein. Examples of such assays can be found in Chard, T., An Introduction to
Radioimmunoassay and Related Techniques, Elsevier Science Publishers, Amsterdam, The
Netherlands (1986); Bullock, G. R. et al., Techniques in Immunocytochemistry, Academic
Press, Orlando, FL Vol. 1 (1982), Vol. 2 (1983), Vol. 3 (1985); Tijssen, P., Practice and
Theory of Enzyme Immunoassays: Laboratory Techniques in Biochemistry and Molecular
Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1985).

The test samples of the present invention include cells, protein or membrane extracts of
cells. The test sample used in the above-described method will vary based on the assay format,
nature of the detection method and the tissues, cells or extracts used as the sample to be assayed.
Methods for preparing nucleic acid extracts of cells are well known in the art and can
readily be adapted in order to obtain a sample that is compatible with the system utilized.

In another embodiment of the present invention, kits are provided which contain the 
necessary reagents to carry out the assays of the present invention. 

Specifically, the invention provides a compartmentalized kit to receive, in close
confinement, one or more containers which comprises: (a) a first container comprising one of the
nucleic acid molecules that can bind to a fragment of the Human genome disclosed herein; and
(b) one or more other containers comprising one or more of the following: wash reagents,
reagents capable of detecting presence of a bound nucleic acid.

In detail, a compartmentalized kit includes any kit in which reagents are contained in
separate containers. Such containers include small glass containers, plastic containers, strips of
plastic, glass or paper, or arraying material such as silica. Such containers allow one to
efficiently transfer reagents from one compartment to another compartment such that the
samples and reagents are not cross-contaminated, and the agents or solutions of each container
can be added in a quantitative fashion from one compartment to another. Such containers will
include a container which will accept the test sample, a container which contains the nucleic acid
probe, containers which contain wash reagents (such as phosphate buffered saline, Tris-buffers,
etc.), and containers which contain the reagents used to detect the bound probe. One skilled in
the art will readily recognize that the previously unidentified drug-metabolizing enzyme gene of
the present invention can be routinely identified using the sequence information disclosed herein
and can be readily incorporated into one of the established kit formats which are well known in the
art, particularly expression arrays.

Vectors/host cells

The invention also provides vectors containing the nucleic acid molecules described herein.
The term "vector" refers to a vehicle, preferably a nucleic acid molecule, which can transport the
nucleic acid molecules. When the vector is a nucleic acid molecule, the nucleic acid molecules are
covalently linked to the vector nucleic acid. With this aspect of the invention, the vector includes a
plasmid, single or double stranded phage, a single or double stranded RNA or DNA viral vector, or
artificial chromosome, such as a BAC, PAC, YAC, or MAC.

A vector can be maintained in the host cell as an extrachromosomal element where it
replicates and produces additional copies of the nucleic acid molecules. Alternatively, the vector
may integrate into the host cell genome and produce additional copies of the nucleic acid molecules
when the host cell replicates.

The invention provides vectors for the maintenance (cloning vectors) or vectors for
expression (expression vectors) of the nucleic acid molecules. The vectors can function in
prokaryotic or eukaryotic cells or in both (shuttle vectors).

Expression vectors contain cis-acting regulatory regions that are operably linked in the
vector to the nucleic acid molecules such that transcription of the nucleic acid molecules is allowed
in a host cell. The nucleic acid molecules can be introduced into the host cell with a separate
nucleic acid molecule capable of affecting transcription. Thus, the second nucleic acid molecule
may provide a trans-acting factor interacting with the cis-regulatory control region to allow
transcription of the nucleic acid molecules from the vector. Alternatively, a trans-acting factor may
be supplied by the host cell. Finally, a trans-acting factor can be produced from the vector itself. It
is understood, however, that in some embodiments, transcription and/or translation of the nucleic
acid molecules can occur in a cell-free system.

The regulatory sequences to which the nucleic acid molecules described herein can be
operably linked include promoters for directing mRNA transcription. These include, but are not
limited to, the left promoter from bacteriophage λ, the lac, TRP, and TAC promoters from E. coli,
the early and late promoters from SV40, the CMV immediate early promoter, the adenovirus early
and late promoters, and retrovirus long-terminal repeats.

In addition to control regions that promote transcription, expression vectors may also
include regions that modulate transcription, such as repressor binding sites and enhancers.
Examples include the SV40 enhancer, the cytomegalovirus immediate early enhancer, polyoma
enhancer, adenovirus enhancers, and retrovirus LTR enhancers.

In addition to containing sites for transcription initiation and control, expression vectors can
also contain sequences necessary for transcription termination and, in the transcribed region, a
ribosome binding site for translation. Other regulatory control elements for expression include
initiation and termination codons as well as polyadenylation signals. The person of ordinary skill in
the art would be aware of the numerous regulatory sequences that are useful in expression vectors.
Such regulatory sequences are described, for example, in Sambrook et al., Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY,
(1989).

A variety of expression vectors can be used to express a nucleic acid molecule. Such
vectors include chromosomal, episomal, and virus-derived vectors, for example vectors derived
from bacterial plasmids, from bacteriophage, from yeast episomes, from yeast chromosomal
elements, including yeast artificial chromosomes, from viruses such as baculoviruses,
papovaviruses such as SV40, Vaccinia viruses, adenoviruses, poxviruses, pseudorabies viruses, and
retroviruses. Vectors may also be derived from combinations of these sources such as those derived
from plasmid and bacteriophage genetic elements, e.g. cosmids and phagemids. Appropriate
cloning and expression vectors for prokaryotic and eukaryotic hosts are described in Sambrook et
al., Molecular Cloning: A Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory Press, Cold
Spring Harbor, NY, (1989).

The regulatory sequence may provide constitutive expression in one or more host cells (i.e.
tissue specific) or may provide for inducible expression in one or more cell types such as by
temperature, nutrient additive, or exogenous factor such as a hormone or other ligand. A variety of
vectors providing for constitutive and inducible expression in prokaryotic and eukaryotic hosts are
well known to those of ordinary skill in the art.

The nucleic acid molecules can be inserted into the vector nucleic acid by well-known
methodology. Generally, the DNA sequence that will ultimately be expressed is joined to an
expression vector by cleaving the DNA sequence and the expression vector with one or more
restriction enzymes and then ligating the fragments together. Procedures for restriction enzyme
digestion and ligation are well known to those of ordinary skill in the art.

The vector containing the appropriate nucleic acid molecule can be introduced into an
appropriate host cell for propagation or expression using well-known techniques. Bacterial cells
include, but are not limited to, E. coli, Streptomyces, and Salmonella typhimurium. Eukaryotic cells
include, but are not limited to, yeast, insect cells such as Drosophila, animal cells such as COS and
CHO cells, and plant cells.

As described herein, it may be desirable to express the peptide as a fusion protein.
Accordingly, the invention provides fusion vectors that allow for the production of the peptides.
Fusion vectors can increase the expression of a recombinant protein, increase the solubility of the
recombinant protein, and aid in the purification of the protein by acting for example as a ligand for
affinity purification. A proteolytic cleavage site may be introduced at the junction of the fusion
moiety so that the desired peptide can ultimately be separated from the fusion moiety. Proteolytic
enzymes include, but are not limited to, factor Xa, thrombin, and enterokinase. Typical fusion
expression vectors include pGEX (Smith et al., Gene 67:31-40 (1988)), pMAL (New England
Biolabs, Beverly, MA) and pRIT5 (Pharmacia, Piscataway, NJ) which fuse glutathione
S-transferase (GST), maltose E binding protein, or protein A, respectively, to the target recombinant
protein. Examples of suitable inducible non-fusion E. coli expression vectors include pTrc (Amann
et al., Gene 69:301-315 (1988)) and pET 11d (Studier et al., Gene Expression Technology: Methods
in Enzymology 185:60-89 (1990)).

Recombinant protein expression can be maximized in host bacteria by providing a genetic
background wherein the host cell has an impaired capacity to proteolytically cleave the recombinant
protein. (Gottesman, S., Gene Expression Technology: Methods in Enzymology 185, Academic
Press, San Diego, California (1990) 119-128). Alternatively, the sequence of the nucleic acid
molecule of interest can be altered to provide preferential codon usage for a specific host cell, for
example E. coli. (Wada et al., Nucleic Acids Res. 20:2111-2118 (1992)).
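Altering a sequence for preferential codon usage means replacing codons with synonymous ones, so
that the encoded peptide is unchanged. The following is a minimal sketch using the standard genetic
code; the function names are hypothetical, and the `preferred` mapping passed in by the caller is an
assumption of the example (the two entries shown in the usage note are illustrative placeholders,
not values taken from the cited reference).

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(orf):
    """Translate an in-frame uppercase DNA string; '*' marks a stop codon."""
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))


def recode(orf, preferred):
    """Replace each codon by the caller's preferred synonymous codon.

    `preferred` maps a one-letter amino acid to a codon. Amino acids that are
    not listed keep their original codon, so the translation never changes.
    """
    if len(orf) % 3:
        raise ValueError("ORF length must be a multiple of three")
    out = []
    for i in range(0, len(orf), 3):
        codon = orf[i:i + 3]
        amino = CODON_TABLE[codon]
        new = preferred.get(amino, codon)
        if CODON_TABLE[new] != amino:
            raise ValueError("preferred codon is not synonymous")
        out.append(new)
    return "".join(out)
```

For instance, with the placeholder mapping `{"L": "CTG", "R": "CGT"}`, `recode("ATGTTAAGATAA", ...)`
changes the leucine and arginine codons while `translate` gives the same peptide before and after.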

The nucleic acid molecules can also be expressed by expression vectors that are operative in
yeast. Examples of vectors for expression in yeast, e.g., S. cerevisiae, include pYepSec1 (Baldari, et
al., EMBO J. 6:229-234 (1987)), pMFa (Kurjan et al., Cell 30:933-943 (1982)), pJRY88 (Schultz et
al., Gene 54:113-123 (1987)), and pYES2 (Invitrogen Corporation, San Diego, CA).

The nucleic acid molecules can also be expressed in insect cells using, for example,
baculovirus expression vectors. Baculovirus vectors available for expression of proteins in cultured
insect cells (e.g., Sf9 cells) include the pAc series (Smith et al., Mol. Cell Biol. 3:2156-2165
(1983)) and the pVL series (Lucklow et al., Virology 170:31-39 (1989)).

In certain embodiments of the invention, the nucleic acid molecules described herein are
expressed in mammalian cells using mammalian expression vectors. Examples of mammalian
expression vectors include pCDM8 (Seed, B. Nature 329:840 (1987)) and pMT2PC (Kaufman et al.,
EMBO J. 6:187-195 (1987)).

The expression vectors listed herein are provided by way of example only of the well-known
vectors available to those of ordinary skill in the art that would be useful to express the
nucleic acid molecules. The person of ordinary skill in the art would be aware of other vectors
suitable for maintenance, propagation or expression of the nucleic acid molecules described herein.
These are found for example in Sambrook, J., Fritsch, E. F., and Maniatis, T. Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY, 1989.

The invention also encompasses vectors in which the nucleic acid sequences described
herein are cloned into the vector in reverse orientation, but operably linked to a regulatory sequence
that permits transcription of antisense RNA. Thus, an antisense transcript can be produced to all, or
to a portion, of the nucleic acid molecule sequences described herein, including both coding and
non-coding regions. Expression of this antisense RNA is subject to each of the parameters
described above in relation to expression of the sense RNA (regulatory sequences, constitutive or
inducible expression, tissue-specific expression).
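The relationship between a sense sequence and the antisense transcript obtained from a
reverse-orientation insert can be made concrete with a short sketch. It assumes the input is the
sense (coding) strand written 5' to 3' in the letters A, C, G and T; the function names are
hypothetical and chosen for this illustration only.

```python
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.upper().translate(DNA_COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """Return the antisense transcript for a sense-strand DNA sequence.

    Cloning the insert in reverse orientation places the opposite strand in
    the coding position, so the transcript is the reverse complement of the
    sense strand with U in place of T.
    """
    return reverse_complement(sense_dna).replace("T", "U")
```

Passing only part of a sequence (for example a slice covering a non-coding region) gives an
antisense transcript to that portion alone.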

The invention also relates to recombinant host cells containing the vectors described herein.
Host cells therefore include prokaryotic cells, lower eukaryotic cells such as yeast, other eukaryotic
cells such as insect cells, and higher eukaryotic cells such as mammalian cells.

The recombinant host cells are prepared by introducing the vector constructs described
herein into the cells by techniques readily available to the person of ordinary skill in the art. These
include, but are not limited to, calcium phosphate transfection, DEAE-dextran-mediated
transfection, cationic lipid-mediated transfection, electroporation, transduction, infection,
lipofection, and other techniques such as those found in Sambrook, et al. (Molecular Cloning: A
Laboratory Manual. 2nd ed., Cold Spring Harbor Laboratory, Cold Spring Harbor Laboratory
Press, Cold Spring Harbor, NY, 1989).

Host cells can contain more than one vector. Thus, different nucleotide sequences can be
introduced on different vectors of the same cell. Similarly, the nucleic acid molecules can be
introduced either alone or with other nucleic acid molecules that are not related to the nucleic acid
molecules such as those providing trans-acting factors for expression vectors. When more than one
vector is introduced into a cell, the vectors can be introduced independently, co-introduced or joined
to the nucleic acid molecule vector.

In the case of bacteriophage and viral vectors, these can be introduced into cells as packaged
or encapsulated virus by standard procedures for infection and transduction. Viral vectors can be
replication-competent or replication-defective. In the case in which viral replication is defective,
replication will occur in host cells providing functions that complement the defects.

Vectors generally include selectable markers that enable the selection of the subpopulation
of cells that contain the recombinant vector constructs. The marker can be contained in the same
vector that contains the nucleic acid molecules described herein or may be on a separate vector.
Markers include tetracycline or ampicillin-resistance genes for prokaryotic host cells and
dihydrofolate reductase or neomycin resistance for eukaryotic host cells. However, any marker that
provides selection for a phenotypic trait will be effective.

While the mature proteins can be produced in bacteria, yeast, mammalian cells, and other
cells under the control of the appropriate regulatory sequences, cell-free transcription and
translation systems can also be used to produce these proteins using RNA derived from the DNA
constructs described herein.

Where secretion of the peptide is desired, appropriate secretion signals are incorporated into 
the vector. The signal sequence can be endogenous to the peptides or heterologous to these 
peptides. 

Where the peptide is not secreted into the medium, the protein can be isolated from the host
cell by standard disruption procedures, including freeze thaw, sonication, mechanical disruption, use
of lysing agents and the like. The peptide can then be recovered and purified by well-known
purification methods including ammonium sulfate precipitation, acid extraction, anion or cationic
exchange chromatography, phosphocellulose chromatography, hydrophobic-interaction
chromatography, affinity chromatography, hydroxylapatite chromatography, lectin chromatography,
or high performance liquid chromatography.

It is also understood that depending upon the host cell in recombinant production of the
peptides described herein, the peptides can have various glycosylation patterns, depending upon the
cell, or may be non-glycosylated as when produced in bacteria. In addition, the peptides may
include an initial modified methionine in some cases as a result of a host-mediated process.

Uses of vectors and host cells

The recombinant host cells expressing the peptides described herein have a variety of uses.
First, the cells are useful for producing a drug-metabolizing enzyme protein or peptide that can be
further purified to produce desired amounts of drug-metabolizing enzyme protein or fragments.
Thus, host cells containing expression vectors are useful for peptide production.

Host cells are also useful for conducting cell-based assays involving the drug-metabolizing
enzyme protein or drug-metabolizing enzyme protein fragments, such as those described above as
well as other formats known in the art. Thus, a recombinant host cell expressing a native
drug-metabolizing enzyme protein is useful for assaying compounds that stimulate or inhibit
drug-metabolizing enzyme protein function.

Host cells are also useful for identifying drug-metabolizing enzyme protein mutants in
which these functions are affected. If the mutants naturally occur and give rise to a pathology, host
cells containing the mutations are useful to assay compounds that have a desired effect on the
mutant drug-metabolizing enzyme protein (for example, stimulating or inhibiting function) which
may not be indicated by their effect on the native drug-metabolizing enzyme protein.

Genetically engineered host cells can be further used to produce non-human transgenic
animals. A transgenic animal is preferably a mammal, for example a rodent, such as a rat or mouse,
in which one or more of the cells of the animal include a transgene. A transgene is exogenous DNA
which is integrated into the genome of a cell from which a transgenic animal develops and which
remains in the genome of the mature animal in one or more cell types or tissues of the transgenic
animal. These animals are useful for studying the function of a drug-metabolizing enzyme protein
and identifying and evaluating modulators of drug-metabolizing enzyme protein activity. Other
examples of transgenic animals include non-human primates, sheep, dogs, cows, goats, chickens,
and amphibians.

A transgenic animal can be produced by introducing nucleic acid into the male pronuclei of
a fertilized oocyte, e.g., by microinjection, retroviral infection, and allowing the oocyte to develop
in a pseudopregnant female foster animal. Any of the drug-metabolizing enzyme protein nucleotide
sequences can be introduced as a transgene into the genome of a non-human animal, such as a
mouse.

Any of the regulatory or other sequences useful in expression vectors can form part of the
transgenic sequence. This includes intronic sequences and polyadenylation signals, if not already
included. A tissue-specific regulatory sequence(s) can be operably linked to the transgene to direct
expression of the drug-metabolizing enzyme protein to particular cells.

Methods for generating transgenic animals via embryo manipulation and microinjection,
particularly animals such as mice, have become conventional in the art and are described, for
example, in U.S. Patent Nos. 4,736,866 and 4,870,009, both by Leder et al., U.S. Patent No.
4,873,191 by Wagner et al., and in Hogan, B., Manipulating the Mouse Embryo, (Cold Spring
Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1986). Similar methods are used for
production of other transgenic animals. A transgenic founder animal can be identified based upon
the presence of the transgene in its genome and/or expression of transgenic mRNA in tissues or
cells of the animals. A transgenic founder animal can then be used to breed additional animals
carrying the transgene. Moreover, transgenic animals carrying a transgene can further be bred to
other transgenic animals carrying other transgenes. A transgenic animal also includes animals in
which the entire animal or tissues in the animal have been produced using the homologously
recombinant host cells described herein.

In another embodiment, transgenic non-human animals can be produced which contain
selected systems that allow for regulated expression of the transgene. One example of such a
system is the cre/loxP recombinase system of bacteriophage P1. For a description of the cre/loxP
recombinase system, see, e.g., Lakso et al. PNAS 89:6232-6236 (1992). Another example of a
recombinase system is the FLP recombinase system of S. cerevisiae (O'Gorman et al. Science
251:1351-1355 (1991)). If a cre/loxP recombinase system is used to regulate expression of the
transgene, animals containing transgenes encoding both the Cre recombinase and a selected protein
are required. Such animals can be provided through the construction of "double" transgenic animals,
e.g., by mating two transgenic animals, one containing a transgene encoding a selected protein and
the other containing a transgene encoding a recombinase.

Clones of the non-human transgenic animals described herein can also be produced
according to the methods described in Wilmut, I. et al. Nature 385:810-813 (1997) and PCT
International Publication Nos. WO 97/07668 and WO 97/07669. In brief, a cell, e.g., a somatic cell,
from the transgenic animal can be isolated and induced to exit the growth cycle and enter G0 phase.
The quiescent cell can then be fused, e.g., through the use of electrical pulses, to an enucleated
oocyte from an animal of the same species from which the quiescent cell is isolated. The
reconstructed oocyte is then cultured such that it develops to morula or blastocyst and then
transferred to a pseudopregnant female foster animal. The offspring born of this female foster animal
will be a clone of the animal from which the cell, e.g., the somatic cell, is isolated.

Transgenic animals containing recombinant cells that express the peptides described herein
are useful to conduct the assays described herein in an in vivo context. Accordingly, the various
physiological factors that are present in vivo and that could affect substrate binding,
drug-metabolizing enzyme protein activation, and signal transduction, may not be evident from in vitro
cell-free or cell-based assays. Accordingly, it is useful to provide non-human transgenic animals to
assay in vivo drug-metabolizing enzyme protein function, including substrate interaction, the effect
of specific mutant drug-metabolizing enzyme proteins on drug-metabolizing enzyme protein
function and substrate interaction, and the effect of chimeric drug-metabolizing enzyme proteins. It
is also possible to assess the effect of null mutations, that is mutations that substantially or
completely eliminate one or more drug-metabolizing enzyme protein functions.

All publications and patents mentioned in the above specification are herein incorporated 
by reference. Various modifications and variations of the described method and system of the 
invention will be apparent to those skilled in the art without departing from the scope and spirit 
of the invention. Although the invention has been described in connection with specific
preferred embodiments, it should be understood that the invention as claimed should not be
unduly limited to such specific embodiments. Indeed, various modifications of the above-described
modes for carrying out the invention which are obvious to those skilled in the field of
molecular biology or related fields are intended to be within the scope of the following claims.

Claims

That which is claimed is:

1. An isolated peptide consisting of an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown in SEQ ID NO:2;

(b) an amino acid sequence of an allelic variant of an amino acid sequence
shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that
hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in
SEQ ID NOS:1 or 3;

(c) an amino acid sequence of an ortholog of an amino acid sequence shown in
SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under
stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
and

(d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said
fragment comprises at least 10 contiguous amino acids.

2. An isolated peptide comprising an amino acid sequence selected from the group
consisting of:

(a) an amino acid sequence shown in SEQ ID NO:2;

(b) an amino acid sequence of an allelic variant of an amino acid sequence
shown in SEQ ID NO:2, wherein said allelic variant is encoded by a nucleic acid molecule that
hybridizes under stringent conditions to the opposite strand of a nucleic acid molecule shown in
SEQ ID NOS:1 or 3;

(c) an amino acid sequence of an ortholog of an amino acid sequence shown in
SEQ ID NO:2, wherein said ortholog is encoded by a nucleic acid molecule that hybridizes under
stringent conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;
and

(d) a fragment of an amino acid sequence shown in SEQ ID NO:2, wherein said
fragment comprises at least 10 contiguous amino acids.

3. An isolated antibody that selectively binds to a peptide of claim 2.

4. An isolated nucleic acid molecule consisting of a nucleotide sequence selected from
the group consisting of:

(a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ
ID NO:2;

(b) a nucleotide sequence that encodes an allelic variant of an amino acid
sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent
conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;

(c) a nucleotide sequence that encodes an ortholog of an amino acid sequence
shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to
the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;

(d) a nucleotide sequence that encodes a fragment of an amino acid sequence
shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and

(e) a nucleotide sequence that is the complement of a nucleotide sequence of
(a)-(d).

5. An isolated nucleic acid molecule comprising a nucleotide sequence selected from
the group consisting of:

(a) a nucleotide sequence that encodes an amino acid sequence shown in SEQ
ID NO:2;

(b) a nucleotide sequence that encodes an allelic variant of an amino acid
sequence shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent
conditions to the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;

(c) a nucleotide sequence that encodes an ortholog of an amino acid sequence
shown in SEQ ID NO:2, wherein said nucleotide sequence hybridizes under stringent conditions to
the opposite strand of a nucleic acid molecule shown in SEQ ID NOS:1 or 3;

(d) a nucleotide sequence that encodes a fragment of an amino acid sequence
shown in SEQ ID NO:2, wherein said fragment comprises at least 10 contiguous amino acids; and

(e) a nucleotide sequence that is the complement of a nucleotide sequence of
(a)-(d).



6. A gene chip comprising a nucleic acid molecule of claim 5.

7. A transgenic non-human animal comprising a nucleic acid molecule of claim 5.

8. A nucleic acid vector comprising a nucleic acid molecule of claim 5.

9. A host cell containing the vector of claim 8.

10. A method for producing any of the peptides of claim 1 comprising introducing a
nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and
culturing the host cell under conditions in which the peptides are expressed from the nucleotide
sequence.

11. A method for producing any of the peptides of claim 2 comprising introducing a
nucleotide sequence encoding any of the amino acid sequences in (a)-(d) into a host cell, and
culturing the host cell under conditions in which the peptides are expressed from the nucleotide
sequence.

12. A method for detecting the presence of any of the peptides of claim 2 in a sample, 
said method comprising contacting said sample with a detection agent that specifically allows 
detection of the presence of the peptide in the sample and then detecting the presence of the peptide. 

13. A method for detecting the presence of a nucleic acid molecule of claim 5 in a 
sample, said method comprising contacting the sample with an oligonucleotide that hybridizes to 
said nucleic acid molecule under stringent conditions and determining whether the oligonucleotide 
binds to said nucleic acid molecule in the sample. 

14. A method for identifying a modulator of a peptide of claim 2, said method 
comprising contacting said peptide with an agent and determining if said agent has modulated the 
function or activity of said peptide. 

15. The method of claim 14, wherein said agent is administered to a host cell comprising 
an expression vector that expresses said peptide. 

16. A method for identifying an agent that binds to any of the peptides of claim 2, said 
method comprising contacting the peptide with an agent and assaying the contacted mixture to 
determine whether a complex is formed with the agent bound to the peptide. 

17. A pharmaceutical composition comprising an agent identified by the method of 
claim 16 and a pharmaceutically acceptable carrier therefor. 

18. A method for treating a disease or condition mediated by a human drug- 
metabolizing enzyme protein, said method comprising administering to a patient a pharmaceutically 
effective amount of an agent identified by the method of claim 16. 

19. A method for identifying a modulator of the expression of a peptide of claim 2, said 
method comprising contacting a cell expressing said peptide with an agent, and determining if said 
agent has modulated the expression of said peptide. 

20. An isolated human drug-metabolizing enzyme peptide having an amino acid 
sequence that shares at least 70% homology with an amino acid sequence shown in SEQ ID NO:2. 

21. A peptide according to claim 20 that shares at least 90 percent homology with an 
amino acid sequence shown in SEQ ID NO:2. 

22. An isolated nucleic acid molecule encoding a human drug-metabolizing enzyme 
peptide, said nucleic acid molecule sharing at least 80 percent homology with a nucleic acid 
molecule shown in SEQ ID NOS: 1 or 3. 

23. A nucleic acid molecule according to claim 22 that shares at least 90 percent 
homology with a nucleic acid molecule shown in SEQ ID NOS: 1 or 3. 
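
Claims 20 to 23 recite percentage thresholds (70, 80 and 90 percent) without fixing how the percentage is computed. As a purely illustrative sketch, and assuming that "homology" is read as simple percent identity over a pairwise alignment that has already been computed, the comparison could look like the following. The function names and the toy sequences are hypothetical and are not taken from the specification.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Both inputs must be the two rows of one pairwise alignment, with '-'
    marking gaps. This is one common convention, assumed here for illustration.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the two aligned sequences reach the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy alignment (hypothetical data): 4 matches over 5 ungapped columns.
print(percent_identity("ACGT-A", "ACGAAA"))
print(meets_threshold("ACGT-A", "ACGAAA", 90))
```

Other conventions (for example, counting gapped columns in the denominator, or scoring conservative substitutions as similar) give different percentages for the same pair of sequences.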
1 GGCGOCTGCC TCCTCTOCCC AGGCCTGAGC TGCCOCTCOC ACTGCCTTTC 
51 CTTCTTCCOC CGAGTCAGAA GCTTC3CGAG GGCCCAGAGA GGCGOTGGGG 
101 TGGGCGACCC TACGCCAGCT CCGGGCGGGA GAAACCCCAC CCTCTCCCGC 
151 GCOCCAGGAA ACCCCCCGCG TTOGGCGCTG COCAGAGCCA TGGAATTCTC 
201 CTGCCTGGAG ACGCGCTGGG OGCGGCCCTT TTACCTGCCG TTCGTGTTCT 
251 GCCTCGOCCT SG9GCTGCTG CAGGCCATTA AGCTGTACCT GOGGAGGCAG 
301 [illegible] GGGACCTGCG CCCCTTCCCA GOGCOCCCCA CCCACTSGTT 
351 CCTTGGGCAC CA5AAGTTTA TTCAGGATGA TAACATGGAG AAGCTTGAGG 
401 AAATTATTGA AAAATACCC7 CGTGCCTTCC CTTTCTGCAT TGGGCCCTTI 
451 CAGGCAT7TT TCTGTATCTA T6ACCCAGAC TATGCAAAGA CACTTCTGAG 
501 CAGAACAGAT CCCAACTCCC GGTACCTGCA GAAATTCTCA CCTCCACTTC 
551 TTGGAAAAGG ACTAGCGCC? CTAGACGGAC CCAACTGCTT CCACCATCCT 
601 CGCCTACTAA CTCCTGGATT CCATTTTAAC ATCCTGAAAG CATACATTGA 
651 GGTGATGCCT CATTCTCTGA AAATGATGCT GGATAAGTOC CAGAAGATTT 
701 GCAGCAC7CA GGACACAACC GTGGAGCTCT ATGAGCACAT CAACTCGATG 
751 TCTCTCGATA TAATCATGAA ATGCGCTTTC AGCAAGGAGA CCAACTGCCA 
801 CACAAACAGC ACCCATGATC CTTATCCAAA AGCCATATTT GAACTCA6CA 
851 AAATCATATT TCACCGCTTG TACAGTTTGT TGTATCACAG TGACATAATT 
901 rrCAAACTCA GCCCTCAGGG CTACCGCTTC CAGAAGTTAA GCCGAGTGTT 
951 GAATCACTAC ACAGATACAA TAATCCAGGA AAGAAAGAAA TCCCTCCACG 
1001 CTGGGGTAAA GCAGGATAAC ACTCCGAAGA GGAAGTACCA GGATTTTCTG 
1051 CATATTGTOC TTTCTGCCAA GGATGAAAGT GGTAGCAGCT TCTCAGATAT 
1101 TGATGTACAC TCTGAAGTCA GCACATTCCT CTTGGCAGGA CATGACACCT 
1151 TGCCAGCAAC CATCTCCTGC AlOCTTTACT GCCTGGCTCT GAAOCCTGAG 
1201 CATCAAGAGA 6ATGCCOGGA GGAGGTCAGG GGCATCCTGG GGGATGQCTC 
1251 TTCTATCACT TGGGACCAGC TGGGTGAGAT GTCGTACACC ACAATGTGCA 
1301 TCAAGGAGAC GTCCCGATTG ATTCCTGCAG TCCCGTCCAT TTCCAGACAT 
1351 CTCAGCAAGC CACTTACCTT CCCAGATGGA TSCACATTGC CTGCAGGGAT 
1401 CACCGTGGT7 CTTAOTATTT GGGGTCTTCA CCACAACCCT GCTCCTGTCT 
1451 GGAAAAACCC AAAGGTCTTT GACCCCTTGA GGTTCTCTCA GGAGAATTCT 
1501 6ATCAGAGAC ACCCCTATCC CTACTTACCA TTCTCAGCTG GATCAAGGAA 
1551 CTGCATTGQG CAGCACTTTG CCATGATTGA GTTAAACCTA ACCATTGCCT 
1601 TGATTCTGCT CCACTTCAGA GTGACTCCAG ACCCCACCAG GCCTCTTACT 
1651 TTCCCCAACC ATTTTATCCT CAAGCOCAAG AATGGGATGT ATTTGCACCT 
1701 GAAGAAACTC TCTGAATGTT AGATCTCAGG GTACAATGAT TAAACGTACT 
1751 TTGTTTTTCG AAGTTAAATT TACAGCTAAT CATCCAAGCA GATAGAAAGG 
1801 GATCAATGTA TGCTGCOAOG ATTGGAGGTT GGTGGCATAG GGGTCTCTGT 
1851 GAAGACATCC AAAATCATTT CTAGGTACAC AGTGTGTCAG CTAGATCTGT 
1901 TTCTATATAA CTTTGGGACA TTTTCAGATC TTTTCTGTTA AAV- 111 OCT 
1951 ACTATTAATG CTGTATACAC CAATAGACTT TCATATATTf 'f CTGTTGTT7 
2001 TTAAAATAGT TTTCAGAATT ATGCAAGTAA TAAGTGCATG TATGCTCACT 
2051 GTCAAAAATT CCCAACACTA GAAAATCATG TAGAATAAAA ATTTTAAATC 
2101 TCACTTCAC7 TACCCCACAT TCCATGCCCT GACCAATCCT ACTGCTTTTC 
2151 CTAAAAACAG AATAATTTGG TGTGCATTCT TTCAGACTTT TTCCTATACA 
2201 TTTTATATGT ACAAATGTAG CAATGTATTT CTATACATGT CATCATTCCT 
2251 ATATTGTTAT TGATrVTTTT CACTTAATAA AAATTCACCT TATTCCTTAA 
2301 AAAAAAAAAA AAAAAAAAAA AAAAAAA 

(SEQ ID NO: 1) 

5'UTR: 1-189 

Start Codon: 190 

Stop Codon: 1720 

3'UTR: 1723-2327 
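
The feature coordinates listed for SEQ ID NO: 1 (start codon at 190, stop codon at 1720, 3'UTR at 1723-2327) can be checked with a little arithmetic. The sketch below is illustrative only: the function name and the returned keys are hypothetical, and it assumes that each codon position refers to the first base of the codon, counted from 1.

```python
def cds_summary(start_codon: int, stop_codon: int, transcript_length: int) -> dict:
    """Derive UTR and coding-region spans from 1-based codon start positions.

    start_codon and stop_codon are assumed to be the positions of the first
    base of the start and stop codons, respectively.
    """
    cds_end = stop_codon + 2                 # last base of the stop codon
    cds_bases = cds_end - start_codon + 1    # coding bases, stop codon included
    if cds_bases % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    codons = cds_bases // 3
    return {
        "five_prime_utr": (1, start_codon - 1),
        "cds": (start_codon, cds_end),
        "three_prime_utr": (cds_end + 1, transcript_length),
        "codons_including_stop": codons,
        "amino_acids": codons - 1,           # the stop codon encodes no residue
    }


summary = cds_summary(start_codon=190, stop_codon=1720, transcript_length=2327)
print(summary)
```

Under these assumptions the coding region spans 1533 bases (511 codons including the stop), which corresponds to a 510-residue peptide and a 3'UTR beginning at position 1723.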
Homologous proteins: 
Top 10 BLAST Hits 

gi I 21 17369 I pi r f IA29363 prostaglandin oowgs-hydroxylaa© (EC 1.31. 
gi I 1111661 3 p I PI 0 611 1 CP4 4 RABIT CYTOCHitONS P«SC 4A4 tCYFJVA4l (P. 
gi|I6<9aUgb|AAA31232.1l"(J02818) cytochrome P-4S0p-2 lOryctola. 
glll6S6leablCAA404 93. 1 1 (X57209) anega-hydroxylasa cytochrcne . 
gilB9989lpirl IAJ4260 laurate Onega -hydroxylase- (EC 1.14. IS. 3) c. 
gl ! iniS7)spl P14S79 ICP4S RABIT CYTOCH30KZ P450 «A5 PRECURSOR (C . 
git20J7B7|gblAAA<)038.11 (HVnS) cytochrocae P-450 IVA1 {Rattus. 
gi 109992) pi rl I B3 4160 cytochrome P450 4A7 - rabbit >gi 116*9651 gb. 
gil3738263ldbjlBAA33a04.il (A3018421I cytochrome P-450 [Bus nos. 
gl I Q39323BlreflKP_0S869S.il cytochro-e P4 50, subfamily IVD. pol . 

BLAST to dbEST: 

gb|AW812435|AW812435 CM1-ST0181-261099-026-s02 ST0181 Homo sapi... 
gb|R56515|R56515 yg94d06.r1 Soares infant brain 1NIB Homo sapie... 
gb|AA337301|AA337301 EST42040 Endometrial tumor Homo sapiens cD... 
gb|AA652746|AA652746 na65c09.s1 NCI_CGAP_Pr22 Homo sapiens cDNA... 
gb|AA863360|AA863360 oh04f03.s1 NCI_CGAP_Kid3 Homo sapiens cDNA... 
gb|AA319338|AA319338 EST21550 Adrenal gland tumor Homo sapiens ... 
gb|BF355963|BF355963 CM1-HT0878-060900-398-b08 HT0878 Homo sapi... 
gb|BF445825|BF445825 nee41d04.x1 Lupski_sympathetic_trunk Homo ... 
gb|AA557324|AA557324 nl81a02.s1 NCI_CGAP_Br2 Homo sapiens cDNA ... 
gb|AV683266|AV683266 AV683266 GKC Homo sapiens cDNA clone GKCDQ... 
gb|AW264444|AW264444 xr03d03.x1 NCI_CGAP_Brn53 Homo sapiens cDN... 

EXPRESSION INFORMATION FOR MODULATORY USE: 
library source: 

Expression information from BLAST dbEST hits: 

gb|AW812435| Stomach 
gb|R56515| Soares infant brain 1NIB 
gb|AA337301| Endometrial tumor 
gb|AA652746| normal prostate 
gb|AA863360| kidney 
gb|AA319338| Adrenal gland tumor 
gb|BF355963| head neck 
gb|BF445825| Lupski_sympathetic trunk 
gb|AA557324| breast 
gb|AV683266| hepatocellular carcinoma 
gb|AW264444| brain 

Expression information from PCR-based tissue screening panels: 
Whole brain 
1 HE FStt LXTRW ARPFY1AFVF CLftLCLLQAJ KLYLRAQRU. RDLRPFPAPP 

51 THWrUSHQKF XQDOtWEKLS EIIEKYFRAf FrwiGFFQAr PCI YDFOYAK 

101 TLLSRT0PK5 RYLQXFSPPL LGKGLAALOG FKWFQHHRUL TPGFHFNZLK 

151 AYIEVKAHSV KHXLDKWEKI CSTQDTSVEV YEHIUSMSLO JIMKCArSKZ 

201 TMCOTK3THD PYAKAIPELS KIIFHrUYSL LYBSDX1FKL SPQGYRE\JJU 

251 SHVLHOYTDrr HOCRKKSLQ AGVRQDWTPtC RKYQDFUJIV L6AKDC8GS3 

301 FSDIDVHSEV STPUACHOT LMSISVILY CI^AUJPEXQt RCREBVRG1L 

351 GDGSSXTWDQ bGSHSYTTWC lKETCRLIPA VpSISRDLSK PLTFTOCCTL 

401 PAcrrvvLsi bglhhhpaav itkmpkvfopl rfsqehsocp hpyayipfsa 

4 51 G5RNCIGQEF AKIEDCVT1A LILLHFRVTP DPTRPL.TFPN HF1 LKFKNGH 

501 YLHLKKLSEC 



Functional domains and key regions: 

[1] PDOC00001 PS00001 ASN_GLYCOSYLATION 
N-glycosylation site 

206-209 NSTH 

[2] PDOC00004 PS00004 CAMP_PHOSPHO_SITE 
cAMP- and cGMP-dependent protein kinase phosphorylation site 
Number of matches: 2 

1 265-268 RKKS 
2 505-508 KKLS 

[3] PDOC00005 PS00005 PKC_PHOSPHO_SITE 
Protein kinase C phosphorylation site 
Number of matches: 4 

1 159-161 SVK 
2 278-280 TPK 
3 292-294 SAK 
4 374-376 TCR 

[4] PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 

[5] PDOC00008 PS00008 MYRISTYL 
N-myristoylation site 
Number of matches: 5 

1 25-30 GLLQAI 
2 298-303 GSSFSD 
3 353-358 GSSITW 
4 451-456 GSRNCI 
5 457-462 GQEFAM 

[6] PDOC00081 PS00086 CYTOCHROME_P450 
Cytochrome P450 cysteine heme-iron ligand signature 
448-457 FSAGSRNCIG 
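
The motif hits above are the kind of result produced by scanning a peptide against short PROSITE patterns. As a minimal sketch, the N-glycosylation pattern PS00001 (N-{P}-[ST]-{P}) can be written as a regular expression and applied to a short stretch of the peptide. The ten-residue fragment used here is an assumption: it is a reading of residues 201-210 reconstructed from the OCR-damaged sequence listing, and the function name is hypothetical.

```python
import re

# PROSITE PS00001, N-{P}-[ST]-{P}, expressed as a regex. The lookahead lets
# overlapping sites be reported, since motif scans count every start position.
N_GLYCOSYLATION = re.compile(r"(?=(N[^P][ST][^P]))")


def find_sites(sequence: str, offset: int = 1) -> list:
    """Return (start, end, motif) tuples using 1-based residue numbering.

    offset is the residue number of the first character of `sequence`.
    """
    sites = []
    for match in N_GLYCOSYLATION.finditer(sequence.upper()):
        start = match.start() + offset
        sites.append((start, start + 3, match.group(1)))
    return sites


# Assumed reading of residues 201-210 (not verified against the original listing).
fragment = "TNCQTNSTHD"
sites = find_sites(fragment, offset=201)
print(sites)
```

With this fragment the scan reports a single site spanning residues 206-209, consistent with the N-glycosylation entry listed above. The asparagine at position 202 is not reported because the residue two positions later is neither serine nor threonine.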
Membrane spanning structure and domains: 

Helix  Begin  End  Score  Certainty 
1      13     32   1.638  Certain 
2      76     96   1.029  Certain 
3      316    336  1.017  Certain 
4      395    415  1.413  Certain 



BLAST Alignment to Top Hit: 

>gi|2117369|pir||A29368 prostaglandin omega-hydroxylase (EC 1.14.15.-) 
cytochrome P450 4A4 - rabbit 
Length = 510 

Score = 521 bits (1328), Expect = e-146 
Identities = 246/493 (49%), Positives = 355/493 (71%), Gaps = 1/493 (0%) 
Frame = +1 



Query: 
Sbjct: 
Query: 
Sbjct: 
Query: 
Sbjct:. 
Query: 
Sbjct: 
Query: 
Sbjct: 
Query: 
Sbjct: 
Query: 

Sbjct: 

Query: 
Sbjct : 
Query: 
Sbjct: 



235 lArvrCLALGLt^AIKLYliRRQRIJ^DLRPrPAPPTSMrLGH^FIOODH-MIKLEEl It 411 

♦A ■* L L 1L+A 4LYL RQ LLR V* FP PP KW LGH ♦ Q+O +8 
21 VAA1,U;U^U,UUAQLYU)RQKU£AI^FTCPPFKWL1^ 80 

412 KY PRA FFFW IGPTOR FFCI Y DPDYAKTLLSRT 9FKSRYLQK PS P P L LGK3 LAA LOG PKW F SSI 

K+P A P+W4 +A 4YDPDY K +L B+DFK+ K P +G GL LOG WF 
91 KFPGACPWWLSGtnCARLLVYDPDYLKVILGB£DF7CAPRilYK2inF4nGYGLI^^ 140 

592 QHRKLLTPGFHETlIiJCAVir«TlAHWK>4MI^XWEKIC3TQOTSVEVYEiilNSM3I.D3IMK 771 

QHRR4-LTP FH+-+ILK Y+ +« SV+*ML0+KE++ S QO+S+E4 ++H++ M+LD IKK 
141 QKRr-ML.TPArKrDJ LKPYVGLKVDSVQIMLDRITEQIjI 5- ODSSLE1 rQH v 5 L/fTLDTlMK 199 

772 CAPSKETHCQTN STHDP YAKAI FELSK 1 1 PHRLYS LLYH 5D1J PKL SPQCYRFQKL3A VL 951 

CArS ♦ + Q + Y +AI +L+ Hr+R ♦ + + 8D -M^LSP+G r + +♦ 

200 CAPSYQGSVQLDWSH3YIQAIHDLMHLVFYFAPJ4VPHQSOPLYPJ»8PBGRLrHRADQLA 259 

952 NQYTCT I IQERJQCSLQAGVKQDNTPKBKYQDFLD I VLSAKDESGS3F3 31 DVUSEV3TFL 1131 

++4T0 -HQ+RK LQ •♦ + ++ + AK B+CS5 SD 0+ +EV TF+ 

260 HEHTDR V I QQRKAQLOQEGEUS KVRRKRB LD FLDVLL FAKKEHGS S LS 0QD LRAE VDTFH 319 

1132 lAGHlJTlAASISWIlYCnJUjNraBaWCREF^GII^IKSSITWDQLGEM^ 1311 

GHDT A* +SHI Y LA +PB8Q RCREE++G+LGOG+SITW+ L *M YTTWCIKB 
320 FEGH OTTASGV 3V I FY ALATHPEB QHRCRSE 1 QGLLGOGAS I TWEHLDQN P YTTMC 1 KEA 37 9 

1312 CRLlPAVPSISRDLSCTLTFPDGCTIJAGITVVLSII^raOlPAAVWXNPKVFDPLRFSQ 1491 

RL P VBS++R LSKP+TFPDG +LP G4 * LSI+GLH+KP W+NP4VFPP RF* 
380 I^yppVPS^P4JL3KPVTFPDGRSLPKGVlLrLSIYGLHY»P-KV»aNPEVFOPFRPAP 4 3B 

1492 EMSDQPJlPYAYLPFSAGSaNCIG^FAWlELKVrriALILUlFRVTPDPTRPLTFPNHPIL 1671 

H 4A+LPFS G+RNCIG++FAM BLKV +AL LL F ♦ PDPTR *L 
439 DSA — YH SHA FLPFSGGARNCI GXQ FAMREUCVAV A LTL LR FELL P D PTRVPI PI ARWL 4 96 

1672 KPKNGMYLHLKKL 1710 

K KHG++L L+KL 
497 KSKNGIHLRLRRT, 509 



Hmmer search results (Pfam): 

Model    Description                   Score  E-value   N 
PF00067  Cytochrome P450               416.5  2.5e-121 
CE00363  E00363 glycine_receptor_beta  2.1    4.7 

Parsed for domains: 

Model    Domain  seq-f  seq-t  hmm-f  hmm-t  score  E-value 
CE00363  1/1 
PF00067  1/1 
1/1 
1 cca6cctctc ttaggctcct aaatatagtg caaaaagttc cagagttcct 
51 ttgttaccca tgaaagcaca tggaacggtg ctggacaggg gcaactggcc 
101 ctggagcaga ggagtaactc catagaactg tccaagcctc agaggcagtc 
151 acaccaccag caagaacctg ggtgggagta ogtgacccaa ggggttcoca 
201 ggctctgaoc ctgccaagag aactcattag aaggtcacca accacaca7a 
251 ctattcctcg gtctcatgaa gaacccaggg accogaccag gcaagatatc 
301 acaaagctga agtttcagct ctggggcaga gcatggatct gaggtctttg 
351 gccctaccac catgcgatca tatgagggcc atcatacaac catcrtgatt 
401 tgggggagga atagggcata gaggaatcat atgaaaagct gaaatgccat 
451 gagttaccca gaagaagctg tgtaagccag aggattct6a gaocctgtca 
501 aataacaaca tctagttcaa ggttggagtt aggtaggagg tagsgaagtc 
551 tgggaaagaa ggagctgaaa cacttgctgt gtgtggctta atgsaacatc 
601 caaggggcca ggacgaactt ggtccagatg aagtcaccac cccctc6cgc 
651 ctgtcttttt tttttttttt tttttttttt tgagacggag tctcactctg 
701 tcaccasgct ggagtgcagt ggcgcgatct oggctcactg caatctttgc 
751 CTCTCGGGTT CAAGCGATTC TCCTGCCTCA GCCTCCTGAG TAGCTGGGAT 
801 TACAGGOGGG CGCCACCACG CCCAGCTAAT TTTAGTACTG TTAGTAGAGA 
851 TGGGGTTTCA CCATCTTGGC CAGGATGGTC TTCATCCCTT GACCTCGTGA 
901 TOCGCCCGCC TCGGCCTCCC AAATTGCTGG GATTACAGGC GTGACCCACC 
951 GOGCCCGGCC CCCTGGAGCC TGTCTTAATC ACTTACCCGC CAAATAAAAT 
1001 CTGGCTCCAG AGAGTGGAGC GTAGGCTTAA GGAATTGGGG GCGGAAGGGC 
1051 GGGGAAGGTG GGGGAGGGAC AGTGATAGGG AGAACAGGGA ATTGTAGCAG 
1101 AAATTGGGTT TATTCTTCAG AGCTGTCAAT GAACACTTAA CATATGCCTG 
1151 TCTTAGCCTA AATCAATGAA TAAATGAATG AATAAATAAA TGAATGAAAT 
1201 GTGGGGAATG CCTATAAAGA TTGCTGGGAC AGGGAGG7GG GGGGAGACAC 
1251 CAGCTTGGGA AGTCAGGCCT GTTAGATCCT AG7TCACCAC CTGATACGTT 
1301 ACAAATACTA AAACCATCAC TTTCAAATTA TTXTTACTAC ATTTTCCTGT 
1351 TATCTGTACT CCACTTTATT TATGTTTCTG GCATCTAGAG TCAGCCCTTC 
1401 ATGGGCATGA GACCCAAGCA GCCACACGAG GCTCTGAACC CAGAAGA3CA 
1451 TATGCTCGGT TTAATGGTCT GTCATCTTAG AATTGTTAAT AAAGTTTTTA 
1501 TCCCGCATTT TCATTTTGCA CTGAGATTCA 7AAATTATAT AGCAGGCCCT 
1551 GACTGTACCT GTA7AGTGGA ATTACTATAT GA7GGTACGC TACTGTGCAT 
1601 ATCTTCCCCC TTCAGTGTTC AGTGOCCTCG TATCGGCACC TTGAACTAGC 
1651 TOVTGGTACA CGCTGGGAAT CAGGGTGGGA ATCAGTTGTA AACCAITTAC 
1701 COGAACACCA CTAGGCAGGC CACAGGATAA AGGAATAATG ATGGTACACC 
1751 TCCCCCTACC TCTACCACCT GGGAATTTTG GTAGAATGCC AGAATQGAAA 
1801 AGAAAATCTC TTGCATAGCC ATTTATAATT TGTGATAAGG AAGAAAAACA 
1851 ATGACCTCAG CTTTAGCATT ATTTTACAAT ATAAATTCAG ATCCCGTGAC 
1901 TGAAAACTGT TGGACTTAAA AGAGGACGCT OCAGGAGCGC AAAAGCAGTT 
1951 GGGCOGAACG AAGCGTGCGC GCTTTGGTAA CCGGCTAGAA ATCCCGCAOG 
2001 CGCGCCTGCC TCCTCTCCCC AGGOCTGAGC TGCCCCTCCC ACTGCCTTTC 
2051 CTTCTTCCCG CGAGTCAGAA GCTTCGCGAG GGCCCAGAGA GGCGGTGGGG 
2101 GTGGGCGACC CTACGCCAGC TCCGGGCGGG AGAAAGCCCA CCCTCTCCCG 
2151 CGCCCCATGA AACCGCCGGC GTTOGGCGCT GCGCAGAGCC ATGGAATTCT 
2201 CCTGGCTGGA GACGCGCTGG GCGCGGCCCT TTTACCTGGC GTTCGTGTTC 
2251 TGCCTGGCCC TGGGGCTGCT GCAGGCCATT AAGCTGTACC TGOGGAGGCA 
2301 GOGGCTGCTG CGGGACCTGC GCCCCTTCCC AGCGCCCCCC ACCCACTGGT 
2351 TCCTTGGGCA CCAGAAGGTA AATGGAAGGG AAAAAGGNTA GAAAAGGAGG 
2401 AAGAGGGGGG CGGAGGAGGA TGCGGCAGAG GAGCCCAGCC GGCAGAGAGA 
2451 CGCAGCTTTC TTCCATCCCT GGGGACCCTC OGGCTTGCAC CGGCCTTTCC 
2501 AGCCCGGCCT GTGCCTCTTA GCATCATTTT TCCTTGCTCT GGAGAATTGC 
2551 TTTCCCGCAG CCCCACAGGG AAAGGTCACA AAAGAGGAAG CTTTGGGGGC 
2601 TGGGAGAGAC CTATTTAAAG AACCTGAATA TGGAAAAAGA AA6CGAGCTG 
2651 TAACTCAACT CTCTCTCTCA TTGCTTCACC AAGCCTTCCA CATGTGTTGC 
2701 TTTAAAAATA GCATGTTATT CTAAATAACT TATTAGTTGC AGAAAATATG 
2751 CAAAATCTAT CCCAATCGTT GGCACCCTTA GTCCATTTTA ACAAGAGAAA 
2801 ATTTTCTTTT CCTAAGATTC TTGTGAAGTA AGGAGCAGCC CCAGCCAGCC 
2851 ACTCGAGAAA TACTGATTGA TGGAAATTTG TAAAGGGAGA CTGTTAGCTT 
2901 TTGGTCTCTC CCGTTTTTTA AATCCACTCC CACCCCTAAT TAAGGTTTTT 
2951 ATTCATTCAA COGACTCTGA GTCGCAATTG TGTGATAGGT ACTAAGATTA 
3001 CAAAGAGAAG CTAAGTCCCT CCCCTGCACC ACCCAAGTCA GGTGCAGACT 
3051 TAGGCCACAG AGAGAAAATG AAAATTTAAG GCAATGGGTO CTTTACTAGA 
3101 GGCCTAGAGA CAAGGGAATA TCTGTCGGAG GAAAGTATAC ATCTCCGCCT 
3151 AGAGAAGGAA GGAAAGTCTG TGAAGGGCTG AGCAGAGTCT TAAAGGATGG 
3201 TTGGGTGGTG rGGGGAAGGC ATTCCAGCAG AGCTACTACA CGATCCTTTG 
3251 GTTTCCCCAC TTTCTAGTCT TTCTTATATA AAGCAACCAC TTTCAACTCT 
3301 TTTATOCGTT TCTTCTGGTA TTTAAATACT TATTTGTAAA ATACtATTAC 

3351 CATATTGCAT CTATTAATTT AATAAGTTTA GACATCTGCT GTGGTTTAGA 

34 01 TATG G TTTGT TCGTCCCCAC CAAGCCTCAT GTTGAAATTT GATTCCCAAT 

3451 GTTGGAGGTG GGATCTCATG GGACATCTTT GGGTCATTGC GATGCATCOC 

3502 TCATGAATGT CTT & 5TGCAG C TGTCTCCTT CATAAGTTCT CACTCTCTTA 

3551 CTCCCTCTTC AACCCCCACA ACTGA 1 fC TT GAAAAGAGCC TQCCACCTCC 

3601 TCCCCTCTCT CTTCCTGTCT CTCACCATGT CGTCTCTGCA CACAACTGCT 

3651 CCTCTTCACT TCCACTATCA GTGGAAGCAG TCTGAGATCC TCCGCACATC 

3701 CAGATGCCAA TGCCATCCTT CTTCTACACC CTGCAGAATT GTAACCCAAA 

3751 TAATCCTCTT TGTCAATCAC CCAGCCTCAG CTATTCCTTT ACA G CAACAC 

3801 AAA TOT ACT A AGACAACATC CACCTATGAA CTTCTTTATG ACAQGCAATC 

3851 ACTTACACTT CATATTCCAC TGTCOCAGTA ACTATATaGT ATTGrTATTTT 

3901 TTAAATAGAA AAACTTCTAT TTGTATTATT TTTATTATGC AAATGTTATT 

3951 TACTSCTGAT CTAAATGGTC CTCTTTCATT TTATTTCCTT TTCTCAJAGA 

4001 ACTTTTTCOC CACCCCCACA GTATTCNNKJ* KNNWKNWNS NKNtmWJNlW 

4051 KNNKNKMHW KKNNJOCWHN NKKXSNHKKN NHNHSKH HTTN NKmiNNtOTfN 

4101 KNMVtDINKHH KNHNWKHNHM KNWMNHKNM MWHNNNHNN KitKWNHNMH 

4151 moosrononw kntwbnnkkn khnhkknnnn mnooctwoiN kxkxkxnhhn 

4201 NHHKHNNNNN KXHmQiXXKN OTGTTATGTA TCTCTACTGT CTCATGAATA 

4251 CTATGTCGTC TGTTGTTTTA ATTGAATTGT TTTGCCATCC TTGTCAAAAA 

4 301 TCAATTGAQC ATAAATGTCA AGGTCTATTT CTGAGTCTTC AATTCTAATC 

4351 CATTGATCTA TATGTCTATC CTAACTCATG GACACAGAGA GTAGAAGGAT 

4401 GGTTACCAAA GGCTGGGAAG GATAGAGGGC AGCTGGGCGA GGAGGTftGGG 

44 51 AASGTTAATG GG7ACAAAAA AAATAGAAAG AATGAATAAC ACCTACTATT 

4501 TGATAGCATA GCAGGGTGGC TATAGTCAAT AATAACTQTA CACTTTTAAA 

4553 TAAAGAGTGT AATAGGATTG TTTGCAACTC AATGGATAAA TGCTTGAGGG 

4601 CATGGGTAOC CCATTCTTCA TGATGTCCCT ATTTCACATT GCATGCCTCT 

4652 ATCAAAAACA TCTCATTTAC TCCATAAATA TATACACCTA CTATCTATCC 

4701 ACAAGTATTA AAAATTATAA ATAAATAAAT TATATAGCTA TCCTTA7GCT 

4751 AGTACCACAC TGCCTTACTG TTGCTTTGTA GTAAGCTTTG AAATCAGGAA 

4 901 GTATGAGTOC CCCGCACTT? SGTATTT TC C AAGATTATTT TCGCTGTTTC 

4 8 SI GAATCCTTGA TTTCTATACA AATfTTASAC •PCAGCCTATC AATTTCTACA 

4901 AGGAAACCAG CTAGGCTTCT GCTTGGGATT GCACTGAATC .TGTAGATCAG 

4 952 TTTGGGGATT ATTGCCATCT TAAGAATATT AGCTCTTCTC ATCCATGAAC 

5O02 ACAGAAAGCC TTTCCGTTTA GTTAGGTCAT CTTTAATTTT T TTT G TTGTT 

5051 ITTTTTTCrT TT7TGAGACA GAGTOCTGCT CTGTOGCCCA GGCTGGAGTG 

5102 CAGTGACGCA ATCTCGGCTC ACTGCAACCT CC6CCTCTOG GATTCAAGCG 

5151 ATTCTCCTGC CTCAGCCTCC CAAGCAGCTC GGACTACAGG CACATGCCAC 

5201 CACACCAACT AATTTTTGTA TT1TCAGTAG AGAOGGGGTT TCACCATATT 

5251 GGCCAGGCTA GTCTCGAAC? CCTGACCTCG TGATCCACCC GCCTCACCCT 

5302 CCCAAAGTGC TGGGATTACA GGOGTCAGCC ACCACTCOCG GCTTTCTTTA 

5351 ATTTTTTTTA ACGATGTTTT TGTATTTTTC AAACTATACA TCTTGCATTT 

54 01 CTTTTGTTAA ATTTATTTGT TTIXJTTCTTT TTAATTTCAT ITCAGACTAT 

5451 TTATTGCATT CATAGTGTTT TAGAGTCCAC ATTCCCTCTT GACTGTCACT 

5502 AAGTTTTTTT TTTTCTGTTT TTGAGAGGTT TCTATCAGAA TTTTGCAGAT 

5551 CAGAGATGAC GGACATGTCA AACTGTCTAA TATTACCAAC CCTCCCCAT? 

5601 TATCAGAtCA GGATCCTTTT GGTGATTCAC CATGCAGGGA AATCTAGTAT 

5651 CTAAGGCTCA AAAGGTGATA CTGTT7TACA TAGGCAGTAA CATTTTATTG 

5701 CTACATAATA ACTACATATT TATGGACTAC CTGTCATATT TTGATAOGTG . 

S752 CATACAATGT GCAGTGATCA AATCAGGGTG TTTAGGGTAT TCATCACTTC 

5801 TAACATTTAT TATTTATTTG TGtTTCGAAC ATTTCAAGTC TCTTCAAGCT 

58 51 CTTCAGAAAT ATTCAATACA TTATTSTTAA CAGTGCTATT GAACACTGGA 

5901 ACTTATTCCT TCTATCTAAA GACAGTAACA TTTTAAGTAT AGTCATAAGG 

5951 TTACAGAAGG ATAAAGTGTG TATAGGGAAA ATTCCCTACA AGATGAGAAT 

6O01 nCATTCCTT ACTCTTAGTA ATACAGQTCT TCAAACATGC CAAGGATATT 

6052 CCTCCCTTCG AGCTTTGAAC ATGCACGTCT GTGGTTATAT TGCTCtCCCT 

» 6101 GCAAATTATT CCTAAAAGAG GCTTGCCCTG ACCATTCAGA CTAAAATAGC 

6151 ACCTCTAGTA CTCTCTATCT CCAACCCTAT TATTATTATC TTGGCCCTTA 

6201 TCACTCTCTG ACACTATACT GTATACTCTT TltjCTTXiTTC GTTTATTATG 

6251 CACCACTAAC TACAATATAA AATCTCTGAG AGCTAGGATC TTTCTTTGCC 

6302 ACTATAAACC TAGTCCATGG TACAGTTCC? GGTGCATAAT AGGTGCTCAA 

6352 TAAATCCTTT GTTGAATGCA TAAATATATT AGGTGCTGAG AAAATTTATT 

6402 TATTCAAAGA TCAATTTACT GCATAGAATA GGCCAGGTGG TTTGACATTT 

6451 ATTCAATAGC CAACATATGG GACCTAGGAT GTACATATGC AAGTGTGTGT 

6S01 GTCTATGTGT GTGTGCATCT GCATGTGTAC TTGGATGTAC TGCAGAGAAC 
6551 ' ATCTATGTAG CTAAGTA6TA TAAAGCACTT GGGCTCCAGA GTTAAACTG3 
6601 AGTTTGAATC CTCATTACTG GTTGCCAGCT GTACACACTT GGGCAGATCA 

66 SI TTTAACCTAG TCTGTACCGC TCAATTTCCT CATCTCTAAA GTAQSGATTG 

6701 TAATCATATC TACTTCATAG CGTTCTTGAT GTAAATATTA AATAACATAG 

6751 AACATGGAAA GCATTTAGCA GCACCTAGT7 CATAGCAG7C CTTGATAAAT 

6801 6TTCCCTGTT GCTATTT GGG GGCACTATGC ATTTTCTGAA CATT7CTGAA 

6051 CAATCTTTAC TAAATATATC TACTACCCCT TTTCAACTGT ATTTAGATCC 

6901 TTCTCTGGGG ATGAAGAAAT ATAAATTAAA TATAGTACAG TATTCACAAC 

6951 AGTT7TCTGT CCTTTTTGTC TAGTCAGGAG TTACAAAAAC TATAATGAAA 

7001 TACTTTCATA TGCCTGGCGT CTTTATCAAA ATTTTTTACC TAAACAAACA 

7051 ATTCTCATAT TAGTTTACAA TATTCATGAG GGCAAAGGCC TTCTCTTCCT 

7101 TATATTTCTC TGTATCTCTA CCACCTGGTA CGTGTGATAG ACAA7AAATA 

7151 CT1GTGTGTT TATTGTTTCT AAATGAATAA ATSAAAAAAT ATTCACATTC 

7201 TTGAAAACCA CTACTCT6GA TAOTCAGTGG GTGCTTATCA CTGGCTTGAT 

7251 7ATGCCAACA TTAACAAAAA AGTGCAGTAT TTTAGAAACT A G6 TTT CA AG 

7301 ACTCTCAACC TTTCAGTGGC CTTGAACTAT CCAGAGAACA CTTTATGGGT 

7351 TAAAATTGCT AAATGATAAC AGAGAAAAAT CGGfiGCCAGA GTTGTCCACC 

7«01 TCTCCAGAGG ATGAGAGCAA ACAATCCTGC AGCAGATACC GTQTGATTGG 

7«51 TCACACGAGC AAAAATCTGG CACCCTTAAG ATTACTTTGC AGCGGCG G AC 

7501 TCCCACCATC ATGCTCAACT GTGTAGATGG GCACACCAAA ACACACACAT 

7551 GCAGGTGCCC TCCA C TTT A C ACAASAAGCA AATGTAAAT6 AATCTTGTTT 

7 601 TCACTGATTT AGAGAAACAA TTTAACTGAG CCATTACTCA TCTGCTTCTA 

7651 AAACCAAAAA CTCCITCTCT CCTCGTAGTA TTTGCACTCT CATTTGTAAA 

77 01 TGTTGGAAGC TGAAAGTT7T GTATTTGAGT TTCCTTTAAO ATTCACACAT 

7751 CTGTGTAAAT GGACCTTCTG TTCTTGGGGG GAGAATTTGC ATTTTCTTTA 

7901 TAGATACAGT TGGCAATTTT TTAGAGAGAA GCATTTACTC CTAAGTCATG 

7951 ASAAATAATC ACTGCTGCAT AATTAGAGAG AGGAACAGGA AGAAGAAATG 

7901 GTGAGCTGGA tctagggtga tgccccattt agtaactgtt aotttcccac 

7951 ATAGGAAATA CTTCTTTTTA GCTTCCAGAT CCCACTCCAA TCTGAGTGTG 

B001 TGATGTTGGC AAGTGAGGCA GAGA5TGTGA CTCGGCTCAC CCTCTATTGG 

8051 GACAACAGT7 CACAGTAAAT CTGATTCAAC AGTGACTTCG TCTGGGG6TA 

8101 CA6GATATAT TAATATTGAG AAGATAAATA CACTAACTTT GTTTAGAGAA 

B151 TTATCCCCCA AGCTTAGAAG TCCCAAAGAA AGCATGTTAT GTCACTTCCA 

0201 GAAAAG7CTC A GG CTCCTCT CCTTGTCTCA CCTTATCACG TCCTGAACTC 

8251 A SC TTGTGTC TATAAGAGGC GACAGGTCCA GCTTGGCT6C CTAATTACTT 

8301 TTACT7TTTT CACTGCAGTT TATTCAGGAT GATAACATGG AGAAGCTTGA 

8351 GGAAAT7ATT GAAAAATACC CTOGTGCCTT CCCTTTCTGG ATTGGGCCCT 

84 01 TTCAGGCATT TTTCTGTATC TATGACCCAG ACTATGCAAA GACAC7TCTC 

8*51 AGCAGAACAG GTAAGAAGAG GGGGAAAGCT CTGGGACCTA TTCCTCCTAG 

8501 AAGTGAAATG CATAAAACCC ATAGGCAAGA TTCCAAAGCA AAGATTGCTT 

8551 TGGGGCCTTT AAGAGACACA GCAGCAAGTA TGGGGAGGTG ACAGGTTTCC 

8601 TACCAATACT GAAG GG GATT CCCATATCCT CCCCACTCCC TTGTCTTGTT 

8651 CAGGTATGCA TGGGCACGTT' GAAGTCGGTA TAACTTAAAG CCTA6CTGGC 

8701 ATTACCAGAC TTGCCAGGCA AGGCTTCCCT TGGCCTCTGT GGGTTTTATG 

8751 ACTTCAGTGT CAGCAACACT TCCCACTCCT ACCCCTGGTC TCGAGCATAA 

8801 GTCTCAAGAG CCTGGGAAAT CAGCAGTAAC TCTACCTCTG CTGGTTCAGT 

8B51 ATGAAAGCCT GAATGCTAGA TCATTAATTT ACCCATCAGA CCTCTTGATN 

8901 NNNNNNNNNN NNKNKNUtiNN KNNNNNNNNN NNNNNNNNXN NNNNNNNNXN 

8951 NttHNMNIWMN HNNMKHMNHH NNNNKNNNNN NNNKNNHtJMN NNMNNNNNNN 

9001 NtmMKHKNNN KNNKVMNNNH NNNNNHK1WX NHXNKhWWNN NNKNKNKHNN 

9051 NNNNKJWKHN NKKKKHKNKH KKKKNKKHNU ttNXNKNWTOW NNKNNmOWW 

9101 HNNNKNNNNN KNKNKHNNNN NNNMNNNNNH NNHNKNHCOW NNfc:WNb!NN6I 

9151 NtWNKNNNNN NNNNKMNNNN KNNKNNKWM NttNNKNHNNtt NNKNHNNNNN 

9201 NNNWKNNHMH NNKNKHHNNN NNNNHNNHMN HNKNNNNNNN HNSMMHKHHH 

9251 NNNKKNNMKN NHKNKNNMNN NKMNNNNNNN NUMSKMNMUH NNKNNXNN3N 

9301 KNWNKNHHNN NNNKNNEXNN NNNNKNNNNW rWNNiWNNKN mmKN.fK.SNN 

9351 NNNNKMHNUH KNNNKNNMNN HHKNKNNNNN rfKNNNNNNKN NWXSNNKWIN 

9401 HNNKKKWNNN NNNNNIOJNHN KNNNNNNNNN NNNNNWNNMN NKKKKNKNNM 

94 51 NKHNKHKHtJH NNWNNNNNN KNKNUNHNNH tIKHNNKNNKV 19KKKMNKNNN 

9501 NNNNKNKNNM NNNNNNNNNN NMNNNNNNNN NNNNNNXNKN NNWWrfXNNN 

9551 KKHNNtJHNXN XNNNNNNXKM MHKNNNNNNN tlHNNNNNNKH NKNKKMKNNN 

9601 KKKKKNKNKK MtnWKNWOm MNNNNKHNND JtKHNKWHKKN mmRHKKNKN 

9651 KKKHNNNHKM NNNHNNXXMH ViNNNHNKNNN tWHMHNNMI.'H NHNNNNKNHN 

9701 NKNKKKNNKN NNNHNNKMN74 KNNHNNHNNN NNMNNNNNKN NNNHTCTCCT 

9751 TGACTCTGCA GATCCCAAGT CCCAGTACCT GCAGAAATTC TCACCTCCAC 

9 S3 01 TTCTTGGTAT GTATGTGCAA ATCAGAGGTA TAACCCACTC TCATTCAAAG 

9B51 tcccxttttcc atagtagagc atgccaaaga aactgaaatc tgaattcaaa 
9901 AGCACAAAGA GTGCAAGGTA GAGCTATACT GAACGTTATC TAGGGCAAAG 

99bl ATTGAAGGGG AGCTCTAAGG TCAACACACC ACCACTTCCC AGAAAGCTTC 

10001 TTCATCC G TT TCTCTCCCAC AAAGTCTTAT TCTCAAGGCA GCAGATACAT 

10031 GAATCTGTCC CCTCTCTCTT TAAAACTACA CCCTTGGOCA GGCACAGTGA 

10101 CTCATGCATG TAATCCCAGC ACTTTGGGAG GCCAAGGTGG GAGGATCAET 

10151 TGAGGTCAAG ATTTCAAGAC CAGCTGGGCC AACATGGTGA AATCCCATCT 

10201 CTACTAAAAA TACAAAAATT AGOCAGGCAT GGTAGCATGT AGGCCTGTAG 

10251 TCCCACTACT TGGGAGGCTG AGACATGAGA ATCGCTTGAA CCTAGGAGGT 

10301 CGACGTTGCC GTGASCTCAS AT7GTGCCAC TGCACTCCAG ACTAOGTGAC 

10321 AGAGCAAAAC TCT GT COG CA GCCCCCAACA ACAAAAAAAA AAC TACXCAA 

10*01 ACTGCAGTCT CACCATCCCT ATTCTTGTTT TCTTTATCCT TCTCTCGTT7 

10451 TCTTGGATGT ITiCC I f' tC T TTTTCGACTT CCTTTATTTC CACATGCGAG 

10501 TCAGTAAAAT TTTGCTCTAG AGTTTGGCAA T A TTCT C TCA GCAGATAAAC 

10551 TAAGCTCTTT AATTACATAA TTGGTATTTA TGTTAAACAA GACATGAATG 

10601 AAAGAAAAGA ATATAGGCTT GTATTAGGAA CCACTTAAAT TTGAATCTTG 

10651 CCCCCTCCTG CATTGACTAG TTAAATATGA TCTTGGGGAA GTCATTTAAT 

10701 CTCTCCCTAT CTCAGTTTCC TCATCTTTGA CAATAAGGAT GAGACTCACA 

10751 TTGCTGGGCT GTTATGAGSA TTAAA7CAAA TACATATTTT TAGCACTACA 

10801 TGTAATGGCC ACCATTCTAT GAGTGACAGA TCATGCATCA TGAGCCTGGA 

10851 AT3TTGTAAG CATTCAATGA ATQCTATCAA TTATGTATTA ATAAACTTTA 

10901 AAGTCCTTTT AAAGCCAAAT CCTAATGAOC AGTCTGGCAA TAGAAGATTG 

109S1 TGAAGCATTA GOCTTGGTAA GTATTTCCAC AT ACT AT CAT TCATAGACCT 

11001 GGGCTCAAGG AGGAAATATC AGQGGACAGA GTGGACACTC TTGTCTCTTT 

11051 CCTTGTGAAT TTATGTTCAT CATATAGTTT ATGGATTGCT TTGGAGTGGA 

11101 AAGGAATTCA CTTGCTCTGT TACTAGTGTC AGCTAGGGAG TAGGTTCGCT 

11151 A2CTTATGTA TTCACTTTCA GTTAACCTCC ACAGCAACAC AGGGAAAAAG 

11201 OTATTTAGTA TCATAGTTCA TTATTGAGAA AAGTAAACCT CAGGAAGATT 

11251 GAGTCACTTA TTCAGTTACT ACATAGGTAG TAACTGGTGA TTTCAGGATT 

11301 AGCGTGCTAA TCTTATAAGG CTTTGAAATT TATTAGACTT TGAAACTGTT 

11351 TCTCACAATA TTAAATACAT CCATOCCAGA OCTAAGCTTC TAAATTCACC 

11401 TTCATCTATT AAATTGCATT GCACATTAAT ACGAGTACTA CTTTGATACT 

13451 CCACTGTTGC AXGACTGCCT GTGGGTCATG GTTACTCCAC GCTGCCTGTG 

11501 TTCCTCATCT ATOCTTCATC TCATCTAATT AAATGSCATA AGGTTTTCTG 

11551 CCTTTTATTT CTCAAGGAAA AGGACTAGCG GCTCTAGACG GACCCAACTG 

11601 CTTCCAGCAT CG TOGCCT A C TAACTCCTGG ATTCCATTTT AACATCCTGA 

11651 AAGCATACAT TGAGGTGATG GCTCATTCTG TGAAAATGAT GCTCGTAAGT 

11701 AAAGGGGGAA AGTGCTCTGT GCATTGOGAA ATGCTCCCAG CAATGGACAG 

117 51 TATTAGGTAT GT G TTTTGTG GGCCATGAAA ATAAAAAATC AGTTTCTAAA 

11B01 AATTTAACCA ATGTACACGT ACTTATTGAA CAATAGGTGT CTGTAAAAAA 

11851 TTTGTTATGT TCTTTGAGTG ATAATATTAA TAAAAAGATC TGGTCCTCTG 

11901 TCTTAGATAT ATTTTGAGAT TTTATGGCAG CAAACCAAGT ACCAAATOGT 

11951 GATAGTTAGA TAGTAAGTGC TGTAGATGT6 TTTCATGGAG GGCGGGTCTG 

12001 TACAAACCTA CCCCAAAGTC TGAGGAAACT GAGAGGCTGA AGAAAAAGGC 

12051 TGACAGTTTC TTAAAAAGAA ACATTCAATA GAGGCT7TCA AACAAAAACC 

12101 ATWKNWTONN NNNNNNKSNH KNNNHKNKNN NNNKNNNNNtt KNNKNTJKNNN 

12151 HNHNNKNNNN NNNMHNKNMI KWNNNNNNN NXNKT4NNKNN HNNNNNNNNW 

12201 KNKKNNNNHil NKNNKHKtfMN NtOtKMUNVNM NWWKMN HKNM WNWNNNNJJN 

12251 WWHHNNNNN mnnnnnnnnw knhmnnnvnm nnnkmhnxnn nnnnwwmnnn 

12301 NMKHNNNNNM KKHtTKlWKNS HXKHHNMHMH CJKWNNKKKHN HNITONNNNNS 

12351 MNKMNNNNOTf HNNMNNNKNM NNNMNNHKNN RNNNNVNKNN NKNKNWHK* 

12401 NNWNNNNNWH KNNNHNWMNH NNNNMMHKNM NWNNNKNNNN NNNKKtWNKN 

12451 NNNNNNNNNN NHNNHNK^HX N^OWtJWWTOJ HKtJKNKMWKN KKNKHMNWhTI 

12501 NNMNXNHWN KHNNHNVMKM tWNNHNUKNN KKNKNUKHKK HKWHNHKKH 

12551 HNNWWMNNNN NXNKNNHNKM NNKHWRtKMN NNNNNMMNNN MHNKKNNNVN 

12601 NHKNKWSNWS NKMVHMNNNM NNNNKHHNKN NNNNNSMMNN HNNNHNNNNN 

12651 MMKWHNMHNN HNNWKMKNKN HKNHNHTOJHN I4NKNNWHKMI4 NNNJtMNNiTON 

12701 KMKNMNNMMN NNNMNXNtWN MNKNNNNXNN NNHNNNKNNN 

12751 KNNWWNNUNN KNKWNNNNKN NNNNT4KNNMJ? HNHNHKNNtiN NNKWMHNNKB 

128 01 KNNNNNNNHN IWNStlNKNKN HKHNH^KNNN KNNWNKHNNN NWNNNNNNNN 

12 951 NNNNSKNNim HKNNNWNWKN NKKNNXNKMK HNRNNKNKHN NNNKNKNNW) 

12901 UNNNNNNNNN HKMHHNNNNN WTH HNX KNNK HNNNNKNKKN NNNNMNNNNN 

12951 KNNNNNSNWN NNNMNNNNNW NNKNNKKNNK WHNHHNNHHN NMMMNNNNNN 

13001 KKKNNlWNNn NTTONNNNNKN NNHNNNNHml BWRHMNNNMN hWNKNNNNNN 

13051 NKKNNNKKNN NNNNNNKtfXSN NNKNNMNNNN NKC4NHNNNNN HNMN3NNNKN 

13101 WJNNNNNKWH NKNNNNHKNN NHKNNKNNNN NHMNNNHNNN NMHNMtlNtWN 

13151 NNSNNNMNHN tWNNWNWNNN NNKrWMWNWN NjmKWWNNN HKHIWNWNNJ1 
i&roatrcixrDJ hnijnnnnktcj nkwctnnnnh kmnricoihkn nsksknhhms 

KUHKMHSHittl KMNmOniKNH HHtmiHNNKN HKHHNMWHKH KHNMffiNHKN 

NNKftCKHNW KNtMHMNKMH KNWWNNNNN NKNKNHNNNH KXKHNKSHHN 

KHiWMKHHKN KHNHWdfKifH KNtWrfTOOWN XRHHNWNKH «QfKJWlJM«M 

NHKNNKNNNN NNHNMVKHNN K3HKHTWMKN HKKKKNKNNH NNKHUNNKKN 

HHNHMKHKMN KHNNHKNNini KNNWNSWNKN NNNKKHNNKH KNtaKNNNNNN 

KNNKfNHXWJ KNWNWnOlHn NtfKNNHNNXN NtfNUKNJJOTiN KNKKNHOKKN 

MNNVNKKKNH KHKHHKXKHH KNNNKTOJNNN KNKKKFTH ;iN H HKKHUHTOCOI 

NNNMNKMWMN HKKMMKNMMN KNMMtiWtWN HNNNKNNSMN WNKMUHUKMN 

HMSWNWWHNN NNNNWTOWW NOTOiNTWINN NNNNNNXNWW WNNNNNHNNN 

hnnkwmmnn hknhnkhnnii HtnonnontMN nkhhknknnm NNNmnraanw 

WMNKNHNKNN NNNNMHMXHC GTCAGGCTTT GCTGGCGGCA CCTCCCTGCA 
ACAGCTCCTC TCCACACTTG CTCTCTTTCT CACTTTTGAA TCCAAACGTT 
TTTGAAAATG TTCTGAGTTT ATTTTAAAAT GTGGCTATGG TGGTTGAGAG 
CAGTGGCAQS GTACCTAGCA AGTTTGGAAT TGAAGTTGGA GGAAGCCCTG 
GGGTAAACCC CTTGTAATTA TGGGTCTTG7 GTCAATGATT GCTTTAATGG 
AACTCTGGTC TCrtTGAAAG CAGAGTTATG GTAATAATTG AAAAGCGGGA 
GATCTTTAAC TCAGCCATTT ACCATATATG CAGTTTTCTC CATGCTCCTT 
CTCACTCCGC TGGGTGTATT TTTCCCTTCC TOSTGCCCTG TGTAAGCACA 
TGCCTTATTT ACTCATGTGA TCTTTOCTTC CTGCTGGGTC AGGGTTGTCX 
CCATTAGATC ATAAAAACAG GGOCAGCCAG GAGCCTTCAA ATGAAGGCAA 
T7TGGTCATG GTGGTGGTGA TGATGTTGGT CTTGACCTCC TGTGCCAGGA 
TAAGTGGGAG AAGATTTGCA SCACTCAGGA CACAAGCGTG GAGGTCTATG 
AGCACATCAA CTCGATGTCT CTGGATATAA TCATGAAATG CGCTTTCAGC 
AAGGAGACCA ACTGCCAGAC AAACAGGTCA GTGGTGGGAG AGCAAAAAAG 
ATATTTCTTC ACATTTTCTA AGTTGTTTAT TAACACATTA TOCCAACTTT 
CTCTTCTAGC ACCCATGATC CTTATGCAAA AGCCATATTT GAACTCAGCA 
AAATCATATT TCACCGCTTG TACAGTTTGT TGTATCACAG TGACATAATT 
TTCAAACTCA GCCCTCAGGG CTACCGCTTC CAGAAGTTAA GCCGAGTGTT 
GAATCAGTAC ACAGGTATTT GTTGGGTTTG CGTTGCCCAC CTCCATACCC 
TGCCATGATT GTACTGTCTC TGTCTAGAGG GATAAACCTT AATATGACAA 
GAGAAAGAAT CTTTGTTATT AATGGAGCTT TTATATAGAC ACTGCTCCAA 
AGAAATTTGA CTTGAGTCCT TTATAAGACT TTGCTTCAAC CATAGCAGTA 
TTATCAGAAT TTTTATATAT ATATATATAC ACTATTTTTA TTATGGACAA 
TTATTATTAA TACAAATATA AGTAGGCACT TAAGAGTTCC ASACATACAT 
GGAATATGGC T TTTTGCACA GCGATTGCAG TAATAATAAT GACAAGCTAA 
AAACATTCAT GCAACATAQG AATGGAGAGT GGAACAGAGT AAACATGGAC 
ATGCACCCGA AAGAATATTG ATTCAAAAAC AGTTTTAGCA ASCATAAACA 
CAAAAGTTGA AATAGATTAA GCTTTTTAAG CAATTCAACA TTACTTGTCA 
TGAATGCCAT AATGGAGAAT ACTTATCAAQ CAGTGAATTA ATCCTTCATC . 
AGCTTCACCA CTTACTAGCA GTTACTAGTA AGTTACTTAC TGCTTTGTTT 
CAGTGTCATC TATAAAATOG AGATTAAAAA AGAACCTATC TCATACATTT 
GTTGTTACGA TGAGTGGGTT AATATATATA AAGCATTTAG SACAGTGCCT 
GGCACTGAAT AGATGTTAAA TGTAAAGTAT AGTTATGTCA AATGTCTTTO 
CTTCCASGAA TTTTGCAAGA CACACCAACA TATGCACACT TACACATACA ' 
TATATGCATA CATGCACATA GATAITATAA AGAGGACACT CAGAGAAGCA 
GGTTATAAAC AATTTAAGGC ATAAATGCSC ATTATAAATA GCAGCAGTTC 
CCAAGTCTTT CTGCATCATT GCACACACA6 AAAATGTTAA TGTTTTTGTG 
CTTCATTGGA GTAAACAGGA ATGGATTTGG GGGAAGCTAT ACAGAACTTT 
GTAAAAAAAA ATCTTTACTT TTTAAATATT ATACAATTAT GATGAAAAAG 
CAAAATGCAA AGTGTTAGGG AAAATATTAA ATGTTAAATT TATTCAAAAC 
TTAAAACCTT TTCAATTTTT TTTTTTTTTT TTTTTTGAGA TGGAGTCTCT 
ATCACTCAGG CTGGAGCGCA GTGCT O TGAT CTCAGCTCAC TACAACCTCC 
ACCTCCCAGG TTCAGGCAAT TCTCCTACCT CAGCCTTCTG AGTAGCTCGC 
ATTACAGGCA CTGCCACCAC ACCTGGCTAA TTTTTTTAAA TTGTTTATTT 
TTATTTAGTC AAATATATCA ATATTTTATT TTATTGCATC VGGATTTTTA 
GTAATCACAA AAAGCCATTC TCTATTCCAG GGTTTCTCAA CCCTCASCAC 
TAATGGCTTC TTAGATTAGA TAAGTCCTTG TTGTCAAGAT GTGTGCATTG 
TAGGATGTTT AGCTACATCC CTGACATCTA CCCACTCGAT GTAGTAGAGC 
TCTGATAGTT ATAGCAACCA TAAATAACTC CAGACATTAT TGAATGTTCC 
CAGGGCCCCC AGTTGAGAAC CACTGCCCTG 7ACCCAGGTT GTAGAGAAAA 
TTATTTATGT TTTCTTGTA6 TACTTGTATA ATTTCATTAT TTTCATATTT 
AAATCAGAGA TCTAAACTCC ATTTAGAATT TATTCCTftTA TATGGTGTGA 
GGTATTGATC TAAXTTTTCC AAATG7TTAT CCAGTTGTCC CATCACCATT 
ATTTAAAAGT TTATCTTTTC AAGTGATTTG AGATAACCAT CACATTCTAA 
ACGGATACAT GTACTGGTAT CTGTTTTGGA TAAGAGTATA TTTGGATGTT 
16501 CTCCTGTATT OCATT6ATCT ATCTACCAAT GTACCAGAAT CACACTCTTT 

16551 TAATTAAGGA GATTTTGTGG CTTTTTTCAA CATTAATASA CCTT A TTTTT 

16601 AGAAAAGTTT TAGGTTTGCA GAAAAATTCA GCAGAAAGTA CAGAGAGTTC 

16651 TCATATTAOC CATCTAACAA ACCTCTACAT GTACOC C TGT ATCTAAAATA 

16701 AAAGTTGAAA TTTTTTAAAT AGTAAATAAA TATTACCTCT GT7CCATATT 

16751 TTT 6 TTTTGT TTTTTTTCTC TCfcCCTCCTT CAATTATAAA 7ATATTGGC* 

16801 TTTCTTTGCC TGTCTTCTAT TTCATTCCAT TTTATTTAAT AA C TTTTC C G 

16851 TGAAGATAAA ATATTAGACT GAGGAAGAAA AGAATAAT7G GTCACTTGCA 

16901 TCTAAACTTC AAATCATCTT AATTTTATTC CCCACATACT GATGGAAACT 

16951 ATGTTTTTTA TTTCTGTTCT TTATCTTTGG AGCTTTAATC AAAASTCCCT 

17001 TTGATGAGAA AATAAAOCAT CTCTCAAAAT TAGATCTA7T TAAAOCTCTC 

17051 GAAATCAGGC AAGATTTGAA GCTATTCACT AACCATGGCT TCCTTTATAA 

17101 TTTATTTGAC TTTGCCATCA CTTTCOTAAT TCCAAACTAT TTTT C TACCC 

17151 AGATACAATA ATCCAGCAAA GAAAGAAATC CCTCCAGGCT GGGGTAAAGC 

17201 AGCAIAACAC TCCGAAGAGG AACTACCAGG ATTTTCTGGA TATTGTCCTT 

17251 TCTGCCAAGG TAAATCTTCT AAATT7CTAA COCTGCTCAA GTGACCAGTT 

17301 AATTATGTAA GTAGOTGGGT AAGTGOGAAT GGGATGGGGA GACAAGAATA 

17351 AAACCGATTG ACTAAATTTA ACTGTACTTT GAATTGATGA GCAGCTTCAT 

17401 GCAATTTGAG ACAAAGAGAC AATTCTGCAA CTGTGTCGCT AGAGGAGOGT 

17451 TAGTAAAGAC TAAACGAACG ATTTGACAAQ ATTTGAGGAT TGTCATATGG 

17501 ATACATGGAT TTTAGGGCAT CATGAAAAAA TGGTCACATG GATAAACC7TA 

17551 AAAATTATGA TGATAAGGTC CTGGCAAATC TGGGACTTTG AAGAGAA7TT 

17601 CTAGGGCCTG TTGATCGAGG GCC C TTTGTO CAAGGCCTGC TTTTCTTATC 

17651 TAACCTTCCT TCTCCTTTAT GCTTTOGGCA GAATATGGTT TATACCACAT 

17701 ATTTGTTGAA CTGAATTAAA ATTTAAACCC CTATTTAAAG CTCTGATTTT 

17751 TCCCCTCAAA TCATTATTGT GGTTG7ATCT CCAAACAT7T ATAAACTGGC 

17801 AITTTATTTA AAATATTTCT ATTCTACTTT CTAGGATGAA AGTGGTAGCA 

17851 GCTTCTCAGA TATTCATGTA CACTCTGAAG TGAGCACATT CCTGTTGGCA 

17901 GGACATCACA C C TTGGCAGC AAGCATCTCC TGGATCCTTT ACTGCCTGGC 

17951 TCTCAACCCT GACCATCAAG AGAGATGCCG CGAGGAGCTC AGGGGCATCC 

18001 TGGGGGATGG GTCTTCTATC ACTTGGTAAC ATCTGCACCC CTAAATTTTC 

18051 CTGCTAGTTT TCCCCCTGAG ATTTTGCTTT ATTTTTTGOG CTGGTACCTT 

18101 AGTGACCCTA GTGCCTCAGG ATA7CTGTAG GTGAAACAGA AGAAGTAGGC 

18151 TACTTTTCTG TTCTTTCTAA AGAGAGCTCC AAATTATTCT CTTGTCTTTC 

18201 AGGAAAAAAA AAAAAGTTTA TTTATCCATA AATTGTCT3T CATTGGTTTT 

18251 CTAATCAATG GTGTGTGAAA TGTCTTATTT CTTTATTTCA CCTTGGCTCT 

18301 6ATGCATTGG AAATGAGGAC TTGATCCCTG GGCTGGCACI TAGAACTTAA 

18351 ACAATAGGGT CCAAGTGGAG CTCCTCTTCT GAGAGAGCTG AA7GATTAGC 

18401 TGCATTATTT AAGCCTCATT TTAGACATCT CCCAGCCGCT TGTCACCAAT 

18451 TTTATTCCTC AGGATTGATT 1TA6ACTTCA GACATAATAT TCGAT6ATAT 

18501 ATACTATAGT TAA C TTTAGC AAATATCGAC TGAGGACATT TTAAATACTC 

18551 AGACTTTTTT TATGACTACA ATTTATTGTG GGCCCTGTCT TCGGTSAGCT 

18601 AATGGTCTAA TACACGAGAC AGGAGAGAGA CCTCCAAATT GCAGTGTAGC 

18651 ATAATGAGGG CAATGATAGA GATATGTGCT GGCTAACACA AAGACATAGA 

18701 AGACAGGTAC CTACCCTGGC A7GGGAGCTC AAGGAGACTT CC T TG ACATT 

18751 TACGCTGACT GCAGGATAAG TAGGAGTTAG CCAGGTGGAA ACTGTCATCT 

18801 CTATCTTGCT A6ACTTTAA5 CATATACTGC TGTTAATAAA GCCCAGGTTA 

18851 TGCTGTTTGC AAAGATAAAA TGTGTTCCTG ACATAATACT GGTCAAAGGG 

18901 AGAGAAAGAC AGAAATGCTA AGGACAATTC AGCAGCAGAC CAGATAAAAA 

18951 ACACCATATT TCATATGCAA AAGTCAACTC AATTGAAACA TTTCTAAAAC 

19001 CAAATTTGAC ATTATAAAAG TATATCAGAG ATCTCATTTT ATAAGGAAAT 

19051 AGAAGCCCTT TCCTACCATA AACTAAAGAT TTAATCTATA TAGCACAAAA 

19101 TACAATGTTG AGTAATCATT TTTAATTTAT TTTTTAACTG ACAAAAATTC 

19151 TGCATATACA TGTTATATAT ATATGTATGT GTGTATATAT ATATGATCTA 

19201 CAACATGATA TTTTGATATA TGTATACACT GTGGAATGAC TAAATCTATC 

19251 AATGGACATG TTCATTAACT CATACTTATC ATTTTTTTGT GGTAAGGACA 

19301 TTTAAAATCT ACCCTCTTAG CAATTTTCAA GTATACAAAT TGTTAGTAAC 

19351 TCCAATCACA TATTGTACAA TGCATCTCCT AAACTTATGC CTCCTGTCTG 

19401 ACTGAAATTT TGTATCCTTT GACTAAGATC CCTGTAATCC CCCATTCTCC 

19451 CACAGCCCCT GGTAACCACT CTTCTACTCT CT G CTTCT1T GAGTTTAATG 

19501 TTTTA6ATTT CCACATGTGA GATCATGTGO AATTTGTCTT TCTGTGCCTG 

19551 OCTTATTTCA CTTAGCATAA TGTCATCCAA ATTCATCTCT GTTGTCATAA 

19601 ATGACAAGAT ATTTGTCTTT TCTATGGCTA ATTGTTAGTC CATTGTTTAT 

19651 A7ATATACCA TGTTTTCTTT A7CCATTTAT CCAGTGATGG ACACTTAAGT 

19701 TGATTTCTAT ATCTGGGCTA TTGTGAATAA TGCTGCAATG AACATGGGAA 

19751 TGTACATGTC TCTTCAATGC ACTGATTTCA TTTCGTTTGG TTGTATATCC 
19801 ACAAGTGGAA TTGCTGCATC ATATGGTAGT TCTATTTTTA ATTTTTTGAG 

19851 GAAACTCCGT ACAATTTTCC ATATGGCTGT ACTAATTTAC ATTCCAAOCA 

19901 AAACTGTATA AGGGTTCTG? TTTCTCCACA TCCTCACCAA CATTTGTCTT 

19951 TTTOGTAATA ACCATTCTAA TCAGCATGAG CTGATCTCTC ATTATGGTTT 

20001 TAAT7TACGT TTCCCTGATG ATTAGTGATG TTGAGCATTC TTTTAAATAC 

20051 CTGCTGGCCA TTCATGTCTT CTTTGTAGGA ATGTTATTTT AGGTTTTTCT 

20101 CATTTTTAAA TCTAGTTATT T C TTTT C TTG CTTTTGAATT GTGTGAGTTC 

20151 CTCATATATT TTGAATATTA ACOCCTTATC AGATGTATCA TTTGCAGACA 

20201 TG7TCTCCCA TOCTTTAACT TGTCTCTTCA CTATGTT3AT TGTTTOCTTT 

20251 GTTGTGCAGA AGCTTTTTAG TTTCCTGCAA AACCATTTAT CTATTTTTTC 

20301 TTCTGTTGAC TATACTTCCA GAGTTGTATC CAAAAAATCA TTGCCAAGAA 

20351 TAATATCAAG AAGCTTTTCT CTATGTTTTT XT CT AST ACT TTTATACTTT 

20401 CAGGTCATAT GTTTAAATCT TTAATCCATT TTTAGTTGAT TTTTGTATAT 

20451 GGAGTCAGAT AAAGGTCCAC TTTTATTCTT CTACTAGTGC ATATCCAGTT 

20501 TTCTCAACAC CATTTAITGA AGATACTGCC CTTTCACCAC TGTATGTTAC 

20551 TGGAACCTT? GTAGATCAGT TGACAATAAA TGTGTGGGTG TATTTCTCGA 

20601 CTCTTTATCC TGTTTTATTA CTTTATATGT CTCTTTTTT7 AGAACCTCTA 

20651 T GC T G TTTTC GTGACTAGAG CTCTGTAGTC AATTTCAGAT CAGGTAGTAT 

20701 GATGCACTCC AGCTTTGCTC TTTTTGCTCA AAATTGC7TT GCCTATTTGA 

20751 GTTTTT77AT TCCATACCAA TTTTAGGGCT TTTTTTTTTT TTOGATTACT 

20801 GTGAATAATC CCATTGGAAT TTTGATGGAG ATPGCATPCA ATCTTT GG GT 

20851 AST AT SCAT A TTTTAACAGT ATTAATGCTT CCAATTAATO AACACAOGGT 

20901 ATTTTGCAAT TTGTGTTTTC TTCAATTTCT TTCACCAGTG TT T T T TTCTT 

20951 AATTTAATTG TTTTATTTOC ATAGGGTTTG GGTAACAGGT GGTGTTTCGT 

21001 TATGAGTAAG TTC1 1TAGTG GTGATTTGTG AGATTTTGAT GCACCCATCA 

21051 CCTAAGCAGT ATACACTGTA CCCAATTTGT AGTCTTGTAT CCCTCACCTC 

21101 CCTCCCACCA TTTCCCCCAA GTCCCCAAAQ TCCATTGTAT CATTCTTATG 

21151 CCTTTGCATC CTCATAGCTT AGCTCCCACT TATGAGTGAG AACATATAAT 

21201 GTTTQGTTCT CCATTTCTGA GTTACTTCAT TTAGAATATT GGTCTOCAAT 

21251 TCCATCCAGA TTGCTGOGAA TGCCTTTATT TTGTTCCTTT TCATGGCTGA 

21301 GTAGTATTCC ATAGTATATA CATCCCACAA TTTCTTTATC CATTCTT6AT 

21351 TGATGGGCAT TTGGACTGGT TCCATGTCTT TACAATTGCG AATTGTGCTG 

21401 CTACAAACAT GCAGGTGCAA GTGTCTTTTT CATATAATGA CTTCTCTTCC 

21451 TCTCGGTACA TACCCTGTAG TGGGATTGCT GGATCAAATG 6TAGTTCTAC 

21501 TTTTAGTTCT TTAAGGAATC TCCACACTGT TTTCCATAGT GGTTGTACTA 

21551 GTTTACATTC CGACCAACAG TGTAGAAGTG TTCCCTGTTC ACTGTATCCA 

21601 CACCATCATC TATTATTATT TGATTTTTTG ATTATGGCCA TTCTTGCAGG 

21651 AGTAA5GTGG TATTGCACTG TGGTTTTGAT TTGCATTTCC CTGATCATTA 

21701 GTGATGTTGA GCATTTTTTC ATATATTTGT TGGCCATTTG TACATOTCT 

21751 TTTGAGAATT CTCTATTCAT GTOCTTTGTC CATTTTTT G A TGGGATTATT 

21801 TGTTTTTTTC TTGCTAATTT GAGTTCCCTG TAGATTCTGG ATATTAGACC 

21851 TTTGTTGGAT GTGTAGGTTC TGAAGATTTT CTCCCACTCT TTGGGTTGTC 

21901 TG TT T ACTCT GCTGATTATT TCTTTTGCTG TGCAGAAACT TTTTAGTTTA 

21951 ATTAAGTCCC ACCTATTTAT CTTTTCGTTG TTGTTGTTTT TTGGGGTTGT 

22001 TTTGTTTTGG CTTGGTTTTG CATCTGCTTT TGGGTTCTTG CTCATGAAGT 

22051 CTTTGCCTAA GCCAATATCT AGAAGGGTTT TTCTGATOTT CTAGAA7TTT 

22101 TATGGTTCAG G TCTTAGATT TAAGTCCTTG ATCCATCTTG AGTTGATTTT 

22151 TGTATAAGGT GAGAGATGAG GATCCAGTTT CATGCTTCTA CATGTGGCTT 

22201 GCCAATTATC CCACTACAAT TTGTTGAATA GGGTTAATAT TTAAAGCTTT 

22251 ATATATTTAG GTGTTCCTAT TTTGGGTACA TATTTATTTA CAACTATCA7 

22301 ATCCTCCT6A TGGATTGACC CCTTTCTCAT T AT ATAATGG ltmriW 

22351 TCTTTTTACA GTTTTTGTCT TAAAGCCTAA TTTGTCTGAT AAAAGTTCAG 

22401 CTACCTTTGC TCTCTTTTGG TTTCTATTTG CATGGAATAT TTTTTTCCAA 

22451 CCCTIOGCAT TCACTCTATG TGT G TTCTTA AAGATGAAAT GAGATGCTGT 

22501 AGGGGCATAT G C TT GG GTCT TGTTTTATTC ATTOVTTCAG CCACCCTTTT 

22551 GATTAGAGAA TTTAATTCAT TTGTATTCAA GGTAATTATT GACAGACAAG 

22601 GACTTACTAC TGCCATTTTG TTAATTGTTT TCTTGATGTT TTATAGATCT 

22651 TTTCTTCCTT TCATCCTCTC TTACTCTTTT CCTTTGTGAT TAGGTGCTTT 

22701 TCTCTAGTGG TGTACTTTGA TTTTTACTTT TTATCTTTTG TTGCTCTACT 

22751 ATA0OTTTTT GCTTTGTGGT TAOCAIGAGG GTTACATAAA GCATAGTTAT 

22801 AAAAGGCTAT TTTAAACTGA TAACAGCTTA ACTTTCAACA CTTAAAAAAA 

22851 CTATACACTT TTACTCTACC AACTGCCCTC CAT7TTATGT CTTTGATGTC 

22901 ATAATTTACC TAGTTTTGSA GATGTGTCOC CTTATT G T G T ATCCCTTAAC 

22951 AAATTATTGT A5CAACAGTC ATTTTTAATA GTTTTGGCTT TTAACTTTAT 

23001 ACTAGAGATA GAATTAATTA ACATACCACC ACTACATTAT TAGGGTATTC 

23051 TAAATTGACT ATGTATTTAC CTTTATCAGT GAGATTTTTG TTTTCAATTT 
TCATGTTGTT AATTAGTATT CTTTCATTTC AACTTGGAGA ATTCACAT7A 
GCATTTTTTC TAAGATGGGT CTAGTAGTGG TGAACACCCT CAACTTTTGT 
TTATCTGGAG ATGTCTTTAC CTCTGCTTCA TTTTCAAATA TAACTTTTGT 
TCCATGATTG AAATGGACAA AATTCTTTTT TTAATTATGC AAACTGCCAC 
GGTAAGCAGA ATTACTCTTT TIITT T TT T T CTGAGACCGA GTTTCACTCT 
TGTTGCCCAG CCTGGAGTGC AGTGGCGCAA TCTCTCAGCT TACCCCAACC 
TCTGCCTCCC AGGTTCAASC GATTCTTCTG CCTCACCCTT CCTGACTAGC 
TGGGATTACA GCCATGCACC ACCATGCTCG GCTAATTTTG CATTTTTAGT 
AGAGACGGGG TTTCTCCATG TTGGTCAGGC TGGTCTTGAA CACCGGACCT 
CAGATGATCC GCCCACCTAG GCCTCCCAAA CTOCTCGGAT TGCAGGTGTG 
AGCCACTGCG CCTGGCCAGA ATTACTCTTA TTTATCCTGA GCTTGACGAA 
GAAAGAATTC AAAATTAAAA TTTCACATTA CCTAATCGCC AAAGCCTGCA 
TTCAAAATAA GTAATCAGAA AAACATATAA AAACACAATA AGATAAACAG 
ACTAAAIATA TGCACTCATT TTKTQGAACC AATCTGACTA GATTGGATGC 
A5ACTASCTA CGATCCAAAT TTAAAAAAAA CTTTATTCTT CTTCCACTTA 
TAAACTTTAA ACCTGCTTTG TGGAGCAAGT TCTTTTTATC TCTGGGGAAA 
GATCCTGAGT AAGTCTCATA GAGTTCTCAT TOWTTAAAT CACAAGAACA 
ATCTTASGTC A5TAATTAAA CTATCTGCCC CAGTGTAATA CTGAAACTTT 
CAAATACTTA TCCACTTGAG C TCTF C TT IC CATCCCAGC7 TGGTACTTCT 
TTGGTCCTAG AAGCCAGCAG TGvJTTTATCA TCGACTTATT CTTACTGACT 
AGCTCCCCAA TACCCAGTAG CTGCTCTTTC TGGCCCCTCC AGGAATGGTT 
TTAGGAGGAA ASOGGATAAG GAGTAAAGGG CTGGTACTAT TGTSATCATG 
CCAAAGGGCT TGGTGGATAT TCCATGCTTC CCTTTCTCTC AAGAGGAAAC 
TCC C TTTCTT GCAGACTCTC TCACTAGAAC TTTCCAGAGG TGAT7CASGG 
GACAAGAGAA TAATTGTCCT TAGGCAGACT CTTTTTCAA5 CTGGTCCCAG 
AGCTTTCCCT CTTGCCAGTT AATTGGTTTA AGGACACAGT TGCACATCCT 
T CC CTTGCCT CTGCTGCTGT CCTCTGCCTT TCTGTCTGTT CTGAGTTATA 
G CC TTTCACA TCAGTCCTGT ACTCOCCAAA CTCCAAGGAG CACAAGTCAG 
ATCATCTAAC TGATCCTCTT CAAGCCTCTT STTTAAGATG GGGGAAGCAC 
CC TTCCTTTT CCATGGCACT CTGCCATTCC AACAACACTT TAAATAAJTT 
TTTCTCTCAA AATTCTTAAG CCTCTCCTCT TTAATCCTTC GCCATTTTTA 
TGTATTATTA CTTTATATGA TGAGCTAAGA GTTACAAAAC TGGTTTTTAG 
AAATCTCCTT AGCAAATGTT TTACTSCTAG TTTAGCAGCT CAC7TTATAA 
TAAGGATATA TGATATATTT CTTT05TTCC TCT GC CTCTG GGACCTCAGC 
TCATCCTGAG GCAGAGAGTC CCATTTTAAC ATTCTCTTAC ATAAAOCAGT 
CGCAAAATCG CTTTAACCTG AGGGTAATAA TTACCAGGAA CAAACAGAAA 
ACAGAAAAAA AGTAAACTGG TTATGATATC TGAGTCCCTT CCCTCOCTCA 
TCCTCACAGG GACCAGCTGG GTGAGATGTC GTACACCACA ATGTGCATCA 
ASGA6ACGTC CCCATTGATT CCTGCACTOC OGTCCATTTC CAGAGATCTC 
AGCAAGCCAC TTACCTTCCC AGATGGATGC AC AT T C OCTG CAGGTCTTTA 
CATTCTTTTC CTAAGCAGTT CJTAGAGGCT ATGOGATCCT GGAGACCACA 
GTGACAAAGA TTAGTGACTC TCTTAGCACT TGGAGAAGTC AAAAGATAAT 
GCTAACATCT GACTTASGTT TTATCACCT* TGAGGAGCTC AGAGGATAAT 
GCTTTGGTCA GACATGAATT TCAATGACTT TCCCAAAGGC ACATAGCCAG 
TTGCAGCAAA GCTAAGCCCA CAATOCATGT CTCTGGAATC CCAGCOCAGG 
GTCTCTTCCA TTGTGGGACA TCATTTCTAA GATAATCTTT STTT G CCT GA 
CTTTGAGACC GAGCTGAAAC TTCATGGAAA ATAGCACCAG CATCTTTATC 
TGAAAGACCA AGGCGGATCT TTGGCCTCAT CATCATAATA TCACCCTTAT 
AAATATACAA CATTTAATAG TTAATATAGA GCCTTCAGAC CCATTATCTC 
ATT7TTCCCC TTGGAATCCA ATGTTAACAG ATGCT7ATAC AATGATTTAC 
AGTTCACTGA ACACTTTTAA GTACTTTCAA TGTGG C CCAA AATCCAGAGG 
CAGCCCCAAT GTGTAGATGA CATTAACTGA TGTGAGCAGA GCTAGAACTT 
GTGCGGAGAC CCTGAGTCTG GAGCCTAGAG TTCTTCCGAA CAACACAGGT 
TTCTGAGCAG GGCTTATAGG AAGCAGAGGG GTCATCTGAG ACATATTATC 
TGATTCAATG TTCTATTAAT TCATGTCTTA GGAAGCAAGC CAACAGGATT 
GCTTCTGGCA AACACCTACA GOCTGTTACT GTAACTTTGC TGACAGACCC 
ACAATTAATT TCTGGAAGCT AGAATTATTT CTGGAAACCA AATAACCCTC 
ACATTCTCTC TOCTTTGTTT TGTACTCTGT TTCTCCCCAA ACCACWTGGA 
TATTTGCCAA AATTCTCCAC TTTCCATATG TGAATAGCAC CAATGGAAAT 
TTCTCATGGG ATCTCCATGA CAGAATCACA GTTCTCTGTG TCTGTGTGTG 
OGT7TTCCTC TCAAGACAGA GTCTTGCTAT GTAGCCCAGG CTGGAGTAGA 
GTGGCGTAAT CTCGGCTCAC TGCAACCTCT GCCTCCCAGG TTTAAGCAGT 
rCTCCTGCCT CAGCCTCCCG AGTACCTGGG ATTACAGGTG CACACCACGC 
CTGGCAAATT TTTGTATTTT TAI7AGAGAT GGGGTTTCAC CATGTTGGCC 
AGGCTAGTCT CAAQCTCCTG ATCTOGA&AC CAGCCCTCCT CAGCCTCCCA 
AAGCGCTGGG ACTACAGCCA TGAGCCACTG CACCCAGCCA GTTCTGTGCT 
TTTATACCTA AATTGTCTCC AGGAGTGCTT AATAGTCCAT TAATAGGTAT 
TTACGCCAGG CACACTCCCT GACGCATATA ATCCCAATAT TTTCTGACAC 
CAAGCTGGGA AGACTCCTTG AAGTTAGGAG TCTGAGACTA GCCTGGGCAA 
CATAGGGAGA LCCTGTCTTT ACAAAAAAAA AAAAGAGAGA GATAGCCAGG 
CATGGT GTT G CATGCTTGTA TTOCTGCCTA CTTGGGGGAC TGAGGCAGGA 
GGATCACTTG AGCTCAGAAG TTCAAGGTTA OCCTGACCAA TCTTCACCCC 
ACTGCTCTCC AGOCTGATTG ACAGGCCAGA OCCTGACTC7 AAAGAAAAAC 
AAAAAACAAA TATTTAAGTA ATTTCCAAAC ATAGCAGAAA ATATAAGCAT 
GGTTTATCAC TTTGATATGA CAOCAACAGC TACTTAAGAT AGAGTCATGA 
ATT CAST AAA TTGTTGTCTG GAAAGCTAAS GTGCCAACCC AAGCCGCATC 
TTCTTAGCTG CTCCTCACTG GTGTCATCAG CTACAGCAGG CAGAGCATTG 
CCAGGAGCTA GCTCTTCCCT TCAAGAACAA AAGTCTTGTT TAAGAGCACA 
GTAGCCCACA ACTTGCTCTT TCTCCTGCAG TCTCTTTTAT TTCCCTCCTT 
TCTTAGGGAT CACOGTG6TT CTTAGTATTT GGGGTCTTCA CCACAACCCT 
GCTGTCTGGA AAAACCCAAA GGTATGATTC T CT CT T GTAC ATAAATACTT 
CCAAGAACTA ATGCTGTGCA AGTCACTTTT TGGTAGCTAA GCACAGAAGT 
GGCTATATAA TTAAGGGAAA TGACACAAAT TAAACAAAAA TAAACATAAA 
AGCCAAAAGA AATGTAAAAC TATTCTATGT TCTTCAAACA CTCTTGACGT 
GTATCA3TGA TTTCTTTCAT GTAAGCCACT AAGGTTTAAS ATCTATTACT 
TGTAACAGGA AGCTGGAGTA TATGTCTCTG TAATAATTGG CCACATCATC 
ATTTTGACTT GATTTCTAAG TGGATGCACA TCCATTTCTA AGTGGATGTA 
TCTCCATACT GAAAATAATA CCACTTGCCA TAGTATTTTT GTTTGCCTGC 
GTATGAGACA AATCAGCTGT 6AAGCTGCAA GGTCTGCAGG TCTSAAGGTA 
CACTGCCCAG TGTAGTAGCC ACGGGCCACA TACGOCTACT GAGCACATGA 
CATGTGGCCA GTTGGAATTG AGTTGTGCTG TAAGTTTAAA ATACGTGCTG 
GATTTTGAAG ACATAGTACC CTAAAAAAAT GTGAAACATT TCCTTTTAGT 
AATTATTTAT ATTGATTACA GGTTGGAATG GTAATTTTTG S7TAAATAAA 
CTCTATTAAG ATTAACTTCA CCTTTTAAAA ATGTGACCAC CAGAACATTT 
TAAATTACAC ATGTAGATCA CATTATATTT CTATTGATCG GTGCTAGGTG 
GTACCTGAAG AAATGTCTTC ATGTT G TTTG GGGGATGGTG TTGGCGTTST 
CCTCTCATTT CAGGTCTTTG ACCCCTTGAG GTTCTCTCAQ GAGAATTCTG 
ATCAGAGACA CCCCTATGCC TACT7ACCAT TCTCAGCTGG ATCAAGCTGA 
GAACAATTTC AACTTGCTGA AAGTACCCAA AGATGTTTAC TTGAGAGTAG 
TTTATTCCTT TCAGCTCCTC AGCTCTATAC ATTCTTCCAG CGAACOGTAC 
ATCTTGGTGC CTATTTGAGC CCCAAAGGAT CAGTTAGTTT TACAAAOGAC 
AATCGTATTC TCTGTCACAT CCTTTTTGGC CATGCCTCAA AAGCAGTCCC 
ACAATGTAAG CTACTGCTCA TAGGCTCAAT GCAGTCCACC TTCAAAGCAA 
GAGAAATAAT TTCATGAGTA ACTCCAACTC CCGCCTTGTT ATAGGGAAGG 
CATCATGTTG GAGCCTCCCA GCTCAAATTC TCACACTCAA CAATTTAAGT 
CTAAAGTTCA AAAGTTTCAA TGGCATTTGG TGGAAAAAAT ATCACTTTAC 
TGTGTACTTC AGACTTCTTG TACT ACT ATT TTACTATAGT CAGAAGAAAC 
ATCATTTTTr CAAGTATCAC TTTCTTTCCC TCTTGTCTTC AGGAACTGCA 
TTGGGCAGGA GTTTGCCATG ATTGAGTTAA AGGTAACCAT TG CC TTGATT 
CTCCTCCACT TCAGAGTGAC TCCAGACCCC ACCAGGOCTC TTACTTTCCC 
CAACCATTTT ATCCTCAAGC CCAASAATGG GATGTATTTG CACCTGAASA 
AACTCTCTGA ATGTTAGATC TCAGGGTACA ATGATTAAAC GTA C TTTCTT 
TTTCGAAGTT AAATTTACAG CTAATGATCC AAGCAGATAG AAAGGGATCA 
ATCTATGGTG GGAGGA7TGG AGGTTGGTGG GATAGGGGTC TCTGTGAAGA 
GATCCAAAAT CATTTCTAGG TACACAGTGT GTCAGCTAGA TCTGTTTCTA 
TATAACTTTG GGAGATTTTC AGATCTTTTC TGTTAAACTT TCACTACTAT 
TAATGCTGTA TACACCAATA GACTTTGATA TATTTT C TGT T C T T T T TAAA 
ATAGTTTTCA GAATTATGCA AGTAATAAGT GCATGTATGC TCACTGTCAA 
AAATTCCCAA CACTAGAAAA TCATGTAGAA TAAAAATTTT AAATCTCACT 
TCACTTAGCC GACATTCCAT GCCCTGACCA ATCCTACTGC TTTTCCTAAA 
AACAGAATAA TTTGGTGTGC ATTCTTTCAG ACTTTTTCCT ATACATTTTA 
TATGTAGAAA TGTAGCAATG TATTTGTATA GATGTGATCA TTCCTATATT 
GTTATTGATT TTTTTCACTT AATAAAAATT CACCTTATTC CTTATCATTG 
CTTTATGGTA TTCTGTAATA TGAATGTACT ATAATTTATT TAACTATTTT 
CCTTATTGCG CATTTAAGTT ATTTCTAGTT TTAAAAACAT GCTTGTCAAT 
GGCAACAAAA GCCAAAATTG ACAAATGGGA TCTAATTAAA CTAAAGAGCT 
TCTGCACAGC AAAACAAACT ACCATCAGAC TGAATGGGCA GCCTACAGAA 
TGGGAGAAAA TTTTTGCAAC CTACTCATCT GACAAAGGCC TAATATCCAG 
AATCTACAAT GAACTCAAAC AAATGTACAA GAAAAAAACA ACCCCATCAA 
AAAGTGGGTG AAGGATATGA ACAGACACTT CTCAAAAGAA GACATTTACG 
CAGCCAAAAG ACACATGAAA AAATGCCTAT CGTCACTCGC CATCAGACAA 
ATGCAAATCA AAACCACAAT GAGATACCAT CTCACAOCAG TTAGAATGGC 
29701 AATCATTAAA AAGTCAGGAA ACAACAGCTG CTCGAGAGGA TGTGGAGAAA 

29751 TAGGAACACT TTTACACTGT TGGTGGCAGG AGAATCACTT GAACOCGGGA 

29801 GCGCGAGGTT GCAGTGAGCC GAGGTGCCGC CACTGCACTC CAGCCTGGGC 

29851 GACAGAACGA GTACTCCATC TGAAAAAAAA AAAAAAAGGA CACCAAACTT 

29901 CTCAATCTTA ATGTTCTCAT CTATCTGGTA TCTTCCATAA TCTCTCTCAG 

29951 *CAGAGTCA7 CTTTTGCTCA TATGATCTTA CACTATTTTT TGTTTATACC 

30001 ATTATAATCT CATTAATTCC AGCAACACAA ATGACAAAAG ACAACTGATT 

30051 T CTC C CC TT C GATGACCTAA TTTGCTTTCA CTCTTCCATC ATCACTTATA 

30101 ACATGATGAT 7CTCAAATTC ATCTACCTAA AATCTATATA TAAAAAAATC 

30151 CCTCCCTTGA ATTCCACATC CTTGGAGACA AACACCCAOG TCTAAAACCA 

30201 AA TTTCTTTA ACACTGGACC AGTOGTCCTG TCTCACTTTC CATTTTCTCA 

30251 CTATTTTGTC AGCTGGTATA CCAATATCCA CCCAGTTAAA CAATATTTCC 

30301 TTGTTTTTTT CTGGTACAAA CCCAAATAAA TTACAAACAT CAATAAAAGT 

30351 AAAA7TCTAA AATAACTCAC TTTCTCTATA TATCTCCTTC TTOCTGGAAA 

30401 AATGGGTTAG GTTAGTTCTT TAAAAGCATG CATGATAAAT TGTACTGAAT 

30451 ACAATATTCA GGTCTGGACA TACTASGTAT AATTTTCTGT G7CTCTGGGC 

30501 TCTTACCTAT TTCGGCTCAA AATAAACAAG 7TTATTAASC TTATTAATAT 

30551 TCAATTTCAT TATCTTCTTT AACAATTATG TTCCCTGGTA GTTTCATTSC 

30601 CAATAATTTA TTTCTCAGCT TGOCASGTGC TTCTAAACTT CTGTGTATTT 

30651 TTTCATATCC AATTTTACTT TAAATATTTT TAGAAAAGAG GTCTGTTAAA 

30701 TTTCCTAATA ATTATTATAT TATTGTTTTT TCACTGACAT TTTGTGAATT 

30751 GAAAACCCTT AAAAATATCA AATCATTTTT TCGAAATATG TGCCACAGAC 

30801 AATTTTCTTA AATAACAAGA CAGAAACAGC GCATTATCAA GAGATAAATA 

30851 TTCAATATAC CTTATATTTC TCTCACACAT TTTTATACCA ACTGTGCCAA 

30901 AAATTGTATA TCATATAAAT GATAACAACT TCACAAACGC ATTCCTTTAT 

30951 CCCTTAACTC TGAAATTAGA AACTTTCATA CGTAGGAAGT AGGGGAAGCA 

31001 TATATTCCCT TTGAAAGGTG CAAGAAAATG TCATTCGCAT TCACCATCCT 

31051 ACTCTTGAAG CTTAAAAAAA ATOGACTGCA AAACATTTAC AAACATAGCA 

31101 TATTTATTGG GT AC CTTTAT GTTTACATAA ATATTGAAGA TATCTGACAT 

31151 ACCT C TTTCA ATCAGATTAT CICACTGACA TTTATTGAOC ACTTTCTATG 

31201 GGGAAAAC 
267 T C 

284 G A 

1269 T C 

2497 T C C 

4486 G A 

4522 G A 

4522 C A 

5015 T G C 

5450 T C 

5450 T C 

5995 G A 

6241 G A 

9479 C T 

10045 C A 

10045 G A 

11994 G A 

14070 A G T 

15535 T C 

17619 C 

18520 A - C 

18525 - T A 

18525 - G A 

19189 T C A 

19259 C T 

19325 G T 

19346 G T 

20945 - T 

20945 T C 

22234 T C 

22234 G T 

22247 C T 

22334 A G 

23033 T 

23936 - A 

23421 A G 

25592 T C 

26407 C A 

26473 C T 

26844 G A 

28394 A 

28417 A C 

29265 A G 

29404 A G 

30417 T 

30793 C G 

Contact; 

DNA 

Position 

267 CCAGCCTCTCTTAGGCTCCTAAATATAGTGCAAAAAGTTCCAGA G TTC C TTTCTTACCCA 



TCiWUVSCJ\C^TGGAAOGGTGCTGGACAGGGGCA>CTG 

CATAGAACTGTCCAAGCC7XACAGGGAGTCACACCACCAGCAAGAACCTGGGTGGGAGTA 
GGTGAGCCAAGGGGTTCCCAGGCTCTGACCCT GCCAAGAGAACTCATT AGAAGGTCACCA 
ACCACACATACTATTCCTCGGTCTCA 
[T,C] 

GAAGAACCCAGGGACCGGACCAGGCAAGKTATCACAAAGCTGAAGTTTCAGCTCTGGGGC 
AGAG CATGGAT CTGAGGTCTTTGGCCCTACCAOCATCCGATCATATGAGGGCCATCATAC 
AACCATCATGATTTGGGGGAGGAATAGGGCATAGAGGAATCATATGAAAAGCTGAM 
CATGAGTT ACCCAGAAGAAGCT GTGTAAGCGAGAGGATTCTGAGACCCTGT CAAATAACA 
ACATCT AGTTGAAGG7TGGAGTT AGGTAGGAGGT AGGGAAGTCTGGGAAAG AASGAG CTG 

284 CCAGCCTCTCTT AGGCTCCT AAATATAGTGCAAAAA tjV rC CA GAGTTCCTTTGTTACCCA 

TGAAAGCACATGGAACGGTGCTCGACAGGGGCAAC7GGCCCTGGAGCAGAGGAGTAACTG 
CATAGAACTOTCCA\AGCCTCAGAGGGACTCACACCACCAGCAAGAACCTGGGTGGGAGT 
GGTGAGCCAAQGGGTTCCCAGGCTCTGACCC7GCCAAGAGAACTCATT AGAAGGT CACCA 

ACCACACATACTATTCC 7CGCTCT CA TCAAGAACCCAGGGACC 
[G,A] 

GACCACGCAAWTATCACAAAGCTCAACTTTCACCTCTGGGGCAGAGCATGGATCTCAGG 
TCTTIXJCCCCTACCACCATC0GATCATAT6AGGGCCATCATACAACCATCATGATTTCCG 
GGAGGAATAGCXsCATAGAGGAATCATATGAAAACCTQAAATCCCATGAGTTAOOCAGAAG 
AASCTGTCTAAGCCAGAGGATTCTGAGACXCTtrrC^^ 

GGAGTT AOGT ACGAGGT ACSGAAgrCT GS GAAAGAJVGCAGCTGAAACAiCTTGCTGTCTCT 

1269 CCTGTCTTAATCACTTACCCGCCAAATAAAATCTGG CTCCAGAGAGTGGAGCGTAGGCTT 

AAGGAATTGGGGCCt5CWAGCGOGG(X!AACGTGGCOCACGGACyun^ATWSGCSAGAACAGG 
GAATTGTAGCAGAAATTGGCrTTATTCTTCACACCTTrCAATW 

TCTCTTAGCCTAAATCAATCAATAAATGAATGAATA^TAAATGAATGAAATGTGGGCAA 
TGCCTATAAAGATTCCTGGGACAOSGAGGTGGCCOGAGACACCAGC TT GG GAAGTCAGGC 
[T,C] 

TCTT AGAT CCniGTTCACCACCTCATAOGTTACAAATACT AAAACCAT CAjCTT TCAAATT 
A T I T , r rACTAC^ ni , i1.C » \pl7A TCTCTACTCCACTTTATTr AaGt TT C T GO CATCTAGA 
GTOVGCCCTTC»TGG&CATGAGACCCAAGCA«X>^ 

ATATGCTCGCTTTAATCCTCTCTCATCTTACAATTCTTAATAAA C TT TTTATCCCGCATT 
TTCAT TTT GC ACTGAGATTCATAAATTaTATAGCA GG OCCTGACTGTACCTGTAJAfirrOG 

2 1 87 ACCCA TGGAA TTCTCCTGG CTGGA GACGCGCTGGGCt^^X^CCCTTTT ACCTGGCG TTCJT 

GTTCTGCCTGGCCCTGGGGCT GCTCCAGGCCATTAAGCTGTACCTSCGGAGGGA3CGGCT 

GGTAAAT/GGAAGGGAAAAAGGWTAGAAAAGCACyjAAGAGGGGGGCGG 
AGAGCAGCCXIAGiAXk»CA GAC AGACGCAGC.TrT(^ lCCATCCCrGGGGACC^ivt^GGCTT 
[T,C,G] 

CACCCGCCTTTCCACCCCGGCCTGTGGCTCTTACCATCA m I TCt- f It^ CTCTGGAGAAT 
TG C T TT CCCGC JVGCCCCACACQGAAAGCTCACAAAAGAGGAAG CTT TCGGGGCTGGGAGA 
GAGCTATTTAAASAACCTGAATATGGAAAAAGAAAGOGAGCTGTAACTCAAGTCTCTCTC 
TCATTCCTTCACCAAG CC WCCA CAT G r G r T UL ri T AAAAATAGCATCTTATTCTAAATA 
ACTTATT AGTTGCAGAAAATA TGCAAAATCTATOCCAATCGTTGGCACCCTT AGTCCATT 

4 496 TGTT ATGT A T CTCTA£TCTCTCATGAATAXrVATGTCGTCTGTTGTTTTAATTGAATTCTr 

TTGGCAT CCTTGTCAAAAATCAATTCACCATAAAT GTCAAGGTCTATTTCTGAGTCTTCA 
ATTCTAATtXATTGATCTATATGTCTAT CCTAACTCATGGACACAGAGAGTAGAAGGATG 
G7^ACCAAAGGCTGGGAAGGATAGAGGGGAGCTGGGGGAGGAGGTAGGGAAGGTTAATGG 
GTACAAAAAAAATAGAAAGAATGA 
[G,A] 

TAA WCCTACTATTTGATAGCATAGCAGGGTGGCTATAGT CAATAATAACTGTACACTTT 
TAAATAAA<3AOTtWAATAGGATTGT7TGCAACTCAATGGATAAATCCTTGAGCM»l^TGGG 
TACCC CATTCTTCATGATGTG CCT ATTTCACA TTGCATGCCT GT ATCAAAAACATCTCAT 
TTACTCCAT AAAT ATATACACCT ACTATGT ATCCACAAGTATTAAAAATT AT AAAT AAAT 
AAATTAT ATAGCTATCCTTATGCTAGTACCACACrGCCTTACTGT TGCTTTGTAGT AAGC 

4522 TGTT A TGTAT CTCTACTGTCT"CATG AATACTATGTOGTCTGTTCTTTTAATTGAATTGTT 

TTGGCATCCTTGTCAAAAATCAA TT GACCATAAAT GTCAAGGTCT ATTTCTGAGT CTT CA 
ATTCTAATC CA1TGATCT ATATGTCTATCCTAACTCATGGACACAG AGAGTAGAAGGATG 
GTTACCAAAGGCTGGGAAGGATAGAOGGGAGCTOGGGGAGGAGGTAGGCAAGGTTAATGG 
GTACAAAAAAAATAGAAAGAA T GAA TAACA CCT ACT A TTTGA T AGCAT AGCAGGGTGG CTT 
[G,A] 

TAG TCAATAATAACTCTACACTtTT AAAT AAA GAGTtTrAATAGGATTGTTTGCAACTCAA 
TGGATAAATGCTTGAGGGGAT GGGT ACCCCATTCTTCATGATGTGCCTATTTCACATTGC 
ATG CCT GT ATCAAAAACA I CT CATT TACTCCAT AAATAT AT ACACCTACT ATGT AT CCAC 
AAGTATTAAAAATTATAAATAAATAAATTATATAGCTATOCTTATGCTAGTACCACACTC 
CCTT A CTGTTG CTTTGT AGT AAGCTTTG AAAT CAGGAA GT A T GAGTCCCOCGCACTTTGG 

4522 TGTT ATGT ATCTCTA CTGTCTCATGAAT ACTATGT CGTC1GTTCTTTT AATTGAATTGTT 

TTGG CAT CCTTGTCAAAAATCAATTGACCATAAATCTCAAGOTCT ATTTCTGAGTCTTCA 
ATT CT AA T C CA TT G ATCT AT ATG TCTAT OCTT AACTCATGGACACAnAGAGT AGAAGGA TG 

GTTACCAAAGGCTCGC A AG G ATAGAGGGGAGCTCCGQ6AGCAGGTACGCAAGGTTAATC6 
GTACAAAAAAAATAGAAACAATGAATAACACCTACTATTTGAT AGCAT AGCAGGGTGGCT 
[C,A] 

TAG TCAATAATAACTGTACACTTTT AAAT AAA GAGTGTAATAGGArTG TTTGCAACTCAA 
TGGATAAATGCTTGAGG^GATGGGTACCCCATTCTTCATGATGTGCCTATTTCACATTTC 
ATGCCTGTATCAAAAACATCTCATTTACTCCAT AAATAT ATACACCTACTATGTAT CCAC 
AAGTATTAAAAATTATAAATAAATAAATTATATAGCTATCCTTATGCTAGTAOCACACTG 
5075 



S430 



54 50 



5995 



S2il 



B«79 



CCTTACTU-n Ul 1 : l^fACTAAGCTTTGAAATCASGAJUrrATCJ^CTCCCCCGCACTTTCG 

tttgtactaagctttga^atcagqvagtatgagtccc^^ 

TTATTTTC GCTO T l T Ott *ATCCITGATTTCtXIACftAATTnAGACTttOOCTATCAXrT 
TCTACAAGGAAACCAGCTAGG tg » C 1 GC ri tjUti ATTGCACTGAATCHTT AGATCAGTTTG 
GGCATT ATTGCCAT CT7 AAGAA T A TT AG GTCTTCTGATCCATGAACA CA CAAAGCCTTTC 
OGTTTASTTAQCTCATCTTTAA lTfl ' l ' l T* lt»l fC T TTTTTTTTCTTTTT I GAGACAGAGT 
[T,G,C] 

CT6CTXrTGT«aCCCAGGCTGSJ*GT<XAG 
TCTtt^TCCAAGCGATTCTCCTrcCCTCAGCX^C^^ 

CCCACCACACCAACTAATTTTTGTATTrTCACTACWCAOCOGG7^*TCACCATAT7GGCCA 
GGCT ACTTCT CGAACTC CTG ACCT CGTGATCCACCCGCCTCACCCTCCCRAAfiTXXTTGGGA 
TTACAGGarrt»GCa«X*CTCCCGGCTTTC^^ T I Tl T 1 AACGAT t.11 Tm/TAT 

GATTCTCCTtXXTCAGCCTCCCfiAGCACCTC^ 

TAATTTTTCTA TJ TT CA XrrAGAGACSG CG TTICACCATATTGCOCAGGCTA(^^TCGAAC 
TCCTGACCTCCTGAT<X*CCtXCCTCACa^ 

CACCIlC7?OCCQS CTCT<^ ff T A ftIT TCl T T T AA0C MCTTT TrGTATTTTTCAAASTATAC 
ATCTTGCA l TT C TTT TC TTAAATTT AJ ' t T O f ITfGl J C 1 T I' M AATTTCATTTCAGACTA 
[T,C] 

TTTTCTGTTTTTGAGAGGTTT CT ATGAGAATTTTGCAC^TCAGAGATSAOGGACATCTCA 
AAjCTGTCT AATA'n'ACCAACCCTCCCCATTT ATCAGATCAOGATCCT T T T G GTGATTCAC 
CATCCAI^CAAATCTAt^ATCTAAGCCTCAAAAGGTCATACTGTTTTACATAGGCACTAA 
CATTTTATTGCTACATAATAACTACATATTTATGGAGTA^ 

GATTX:TC«GCCTCAGC<7rCCCAAGCAGC?rcGGA^ 

TAATTTTTGTATTTT CAGT AGAGACGGGCTTTCACCAT ATTGGCCAGGCT ACTCT CCAAC 
TCCTCACCT X »TCATCCACCCOCCTCACCCTCCC^ 

CACCACTCCCC GC TTTCT H AA 1 11 II 1 1 t AACGAT G T I T TI - G TATTTTTCAAACTATAC 
ATCTTGCATT TC TTTT G TT AAATTTATT TC TTTT G 'TT C rrTTTAATTTCATTTCAGACTA 
[T,C] 

TT ATTGCA TTCATAGT GT TTTAGAGTCC ACATT C C CTCTTGACTGT CA !_T AAOTTTTTTT 
TTTTCTGT TTTTGAGAGGTTT CTATCAGAATT TTGCAGATCAGAGATGACG GACATGTCA 
AA CTG T CTAAT ATT ACCAACCCTCCCCATTT AT CAGA7 CAGGATCCTTTTGGTGA TTCAC 
CATGCACCGAAATCTAGTATCTAAGGCTCAAAAaGTGATACT O !'! I TACATAOGCAGTAA 
CATTrrATTGCT A CATAATftACrACATATTTATOCDUSTACCTtrrGATATTTT<^TAXGTC 

TTATfGCTACATAATAACTACATATTTATGGAGTAOCTGTGAT A TTTTGATAOGTGCATA 
CAATGTGCACTt^TCAAATaUXKTTCTTTA^^ 

TATTTST G TT rGGAACATTTCAAGT CTCTTCAAGCTCTTCAGAAATATTCAATACATTAT 
TGTTAJICAGTGCTATTCAACACT^C^ACTTATTCCTTCTATCTAAAGACAGTAACAT TTT 
AAGTATAGTCATAACGTTACAGAAGGATAAAGTGTGTATAGGGAA^ATTCCCTACAAGAT 
[G,A] 

AC AAT TT C ATI CCTTACTCTT ACT AATACAGGTCTTCAAACATGCCAAGGftT ATTCCTCC 
CrrGGAGCTTI GAACATCCACGTCTGTGGTTATATTGCTCrCCCTG CAAATT ATTCCT AA 
AAGAGGCTTGCCCTGACCATTCAGACTAAAATAGCACCTCTAGTACTCTCTATCT CCAAC 
CXTATTATTATTATCTTGGCtXTTATCACTCTCTO 

TGTTCGTTTAT^ATCCACCACTAACTACAATATAAAATCTGTGAGAGGTAGGATCTTTGT 

AGTCATAAGCPTTAOM»AGGATAAAGTGTOTATAGGGAAAArrC^^ 

TTCATTCCT 1 ACTCTTAGTAATACAX MJ f C T rC AAACATGCCAAGGATAl fCCT CC CTT GG 



GCTTGCtXTGACX^TTCAWCTAAAATAGC^CriCTAGTACTCT 

TATTATTATCTTGGCCCTTATCACTCTCTGACACTATACTG^^ 

[G,A] 

TTTATTATCCACCACTAACT ACAAI AT AAAATCTCTGAGAGGTAGGATC 7 G 1 1 TOCCA 
CTATAAACCTAGTOCATGGTACAGTTCCTGGTGCATAATAGGTGC^^ 

TTGAATC^CATAAATATATTAGGTGCTGAGAAAATTTATTTATTCAAAGATCAATTTACTG 
CATAGAAT AGGCCAGGTCCT TT GA CATTT A TT CAAT AGCCAACAT ATG GGACCT AGCATC 
T ACATATG CAAGTGTGTtn^STGTATGTGTGTGTGCAT CTGCATGTGTACTTGGATOTACT 

AAAGCATGTTATGTCACTTCCAGAAAAGTC«»GGCTCCT^^ 

GGTCCTGAACTCAGCTTC7GTXTATAAGAGGGGACAG3TCCA3CTrGGC7GGCTAATTAC 
TTTT ACTTTTTTCACTGCAGTTTATTCAG<^TGAT AACATGGAGAAGCTTGAGGAAATTA 
TTGAAAAAT ACCCTCGTGCCTT CCCTTTCTGGATTGGGCCCTTT CAGGCATTTTTCTGTA 
TCTATCACCCACACTATCCAAAGACAiCrrrCTCACCACAACASGTAACAACAGGGCCAAA C 
[C,T] 

TXrn>GOtCC7ATTCCTCCTAGAAGT'GAAATGCATAAAA£CCATAGGCAA<^TTCCAAAGC 
AAAG A 7 I G G7 TTGG<SGCCTrTAA6AGACACAGCAQCAAC7 ATCGOGAGCTGACACGTTTC 
CTACCAATACTGAACGGGATTCCCATATCCTCCCCAGTOCCTT G T C TT C1 1C AOGTATGC 
ATGGGCACCrrGAAGTOCCTATAACTTAAAGCCTAGCTG&a 

AAGGCTTUX J TGCCCTCTGTGGGTTT7 ATGACTTCAGTGTCAGCAACAC'JI CCCACTCC 

10045 TCTGCTTGACTCTGCAGATCCCAAGTCCCAGTACCTGCAGAAATTCT CACCT CCACTTCT 

TGGT ATGT ATCTG ty^AATGAGAGGTAT AACCCACTCTCATTCAAACTCOCCTTTOCATAG 
TASASCATGCCAAAGAAACTGAAATCT GAATTCAAAAGCACAAAGAGTGCAAGGTAGAGC 
TATACTCAACGTTATCT ASGGGAAAGATtGAAGGGGAGCT CTAAGGT CAACACACCACCA 
C T T C C CA <»AAAG C TTCTTCATCC G TTTCTCTCCCACAAXGTCTT AT T C T C AAGCCAGCAG 
[C,A] 

TACA^GAATCTGTCCCCTCTCTCTTTAAAACTACAGCCTTGGCCAGGCACAGTGACTCAT 
GCATCTAATCCCAGCAC 1 1 1XMGAGWXAAGCTGGGAGGATT^UnTGAOGTCAAGATTTC 
AA£^ACCAGCTGGGCCAACATGGTGAAATCCCATCTCTACTAAAAATACAAAAATTACOCA 
GGCATGGTAGCATGTAGCCCTGT AGTCCCACTACTTGGCAGCCTGAGACATGAGAAT OGC 
TTGAACCTAGGAGGT3G 

10045 TgrGCTTCACTCTGCAGATCCCAACTCCCAiCTACCTGCACAAATTCrCACCTOCA CTTCT 
TGGT ATGTATGTGCAAATGAGAgGTATAACCCACTCTCATTCAAAGT O C C CTTI OCATAC 
TAGAGCATGCCAAAGAAACTGAAATCTGAATTCAAAAGCAX^AAAGAGTGCAAGGTAGAiGC 
TATACTCAACGTTATCTAG^GtiAAAGATTGAAGGGGAGCTCTAAGGTCAACACACCACCA 
CTT CCCAGAAAGCTTCTTCAT COGTT TCTCTT CCCACAAAGTCTTATTCTCAAGGCA GCAG 
[G,A] 

TACATGAATCTGTCCCC T C TCTCTTTAAAACT ACAGCCTTGGCCAGGCACAGTGACTCA7 
GCATGTAATGOCAGCACTrTGtKAGGCCAAGGTGGtMGGATCACTTGAGGTCAAGATTrC 
AACACCAGCTGGGCCAACATGOTCAAATCCCATCTCT ACT AAAAAT A CAAAAATT AGCCA 
GGCATGCTAGeATGTAGGCCTGT 

11994 GSTAAGTAAAGGGGGAAACTGCrcrGTGCATTGCGJ^TGCrCCCA^ 

TACGTATGTGTTTTGTGGGCCATtiAAAATAAAAAATCAirrTTCTAAAAATTTAACCAATG 
TACACCTACTTATTGAACAATAGGT GTC T CT AAAAA ATT T OT T AT OT IV TTrGAGTCATA 
ATATT AAT AAAAAGATCTCGTCCTCTCTCTT ACATATATTTTGAGATTTT ATGGCAGCAA 
ACCAAGTACCAAAltSGTGATAGTTACATAGTAAGIXSCTGTAGATGTGTTTCATGGAGGGC 
[G,A] 

GGTCTGT ACAAACCTACCCCAAAGTCTGAGGAAACTGAGAGGCTGAAGAAAAAGGCTGAC 
ASTTT CTTAAAAAGAAACATTCAATAGAGGCTTTCAAACAAAAACCAT 

14070 GGTI^GGCTTTGCTGGGGGCAIiCT'CCCTGCA^ 

T CACTTTTGAATCCAAACG TTTT TllAAAA TXTTrCTGACTTT ATTTT AAAATG TGG CT AT G 
GTGGTTGAG A SCAGTGGCAGGGT ACCTAGCAA G T T T OG AA7TCAAGTTGGAGGAAGC CCt 
GtXJGTAAACCOCTTGTAATTATGGt?rCTTCT<STCAATGATTGCT"fTAATGGAACTCTGGT 
CTGTTTGAAAGCAGAG7T ATGGT AATAATTGAAAAGCCGCACATCTTTAACT CAGCCATT 
[A,G,T] 

accttatatccagttttctccatgctccttctcactcogctgcgtgt at t t ' f fcccttcc 
tcgtgccctgtgtaagcacatggcttatttacrcai^tgatcttrggttccrgctgggtc 
agggttgtctccattagatcataaaaacagggccaggcaggagccctcaaatgaaggcaa 
tttg^tcatg^tggtggtcmtgatgttggtcttgacctcctgtsccagxmtaagtgggas 
aagatttgcagcactcag6acacaagogtggaggtctatgagcacatcaactcgatgtct 



15535 ACTTACT G CTTT GTTT CAGTGTCATCTAT AAAATGGAGATT AAAAAASAAOCTATCTCAT 

ACATTTGTTGTTAOGATGA G TG G GTTAATATATATAAAGCATTTAGGACAGTGCCTGGCA 
CTGAATAGATGTTAAATGTAAAGTATAGTTATGTCAAATGTCTTTGCTIT^C^GGAATTTT 
GCAA GA CACAC CAACATAT 3CACACTTACACAT ACATATATG CATACATGCACAT AGAT A 
TTATAAACAGCA^ACTCAGA<^AGCA^XTTATAAA£AATTTAAGGCATAAATGGGGATTA 
[T,C] 

AAATAi^CAGGAGTTCCCAAGTCTTTCTGCATCATTGCACACACAGAAAATGTTAATGTTT 
1TGTGCTTCATTCGJUnAAA<^GCAATGCATTTGCGtSGA^2CTATACAGAAC17TGTAAA 
AAAAAATCTTTACTTTTTAAATA TTATACAA TTATGATGAAAAAGCAAAATGCAAAGTG7 
TAGGGAAAATATTAAATCTTAAATTTATTCAAAACTTAAAA C CTTT T C AATTTT TTTTTT 
TrTTTTTTTTTGAGATGCAGTCTCTATCACT CAGGCTGGAGCGCAGTGGTCTGATCTCA G 

17618 GGTAAITrGGGAATGGGATGGGGAGACAAGAATAAAACCGATTtMCTAAATTTAACTGTAC 
TTTGAATTGATGAGCAGCTTCATGCAATTTGAGACAAAGAGAGAATTCTGCAACTGTGTC 
GCTACAGCAGGuTTACTAAACACTAAAOGAAC^TTTGACAACATTTGJlCGATrGTCATA 
TGGATACATGGATTTTAGGGCATCATGAAAAAATGGTCACAT^XATAAACGTAAAAAITA 
TGATGATAAGGTCCTGGGAAATCTGGGJWrrrTCAAGACAATnCTAGGGCCTGTTCATCG 
[C,T,A] 

GGGCCCTTTGTGCAAGGCCTGCTTTTCTTAICTAACCTTGGTTCTCCTTTATGCTTTGGG 
C AGAATA TOGTTTATAOCACATATTrC TTGAACTCAAT TA^ 

AGCTCTGATTTTT CCCCT CAAATCATTATTGTGGTTGTATCTCCAAACAT1T ATAAACTG 
GCAl^TTArrTAAAATATTTGTATTGTACTTTCTACGATCAAAGTGGTAGCAGCTTCTCA 
GATATTGATCTACACTCTGAAGTGAGOICATTCCTGTTGG 

18520 ATTTATOCATAAATTGTCTGTCA1 1 SCTt PTCI AATCAATGCTGTGTGAAATGTCTTATT 

TCTTTATTTCJtfXriTGGCTCJGATGCATTGGAAAT^^ 

TTASAACTTAAACAATACGGTCCAACTGCAGCTCCTClTCn i AGAGAGCTQUlTGATTAG 
CTGCATTATTTAAGGCTCATTTT AGACAT CTCCCAGCOGCTTGTCACCAATTTTATTCCT 
CAGGATTGATTTT ACACTTCAGACA T AAT A TTCGATGAT ATA T ACT AT AGTT AAGTTT AG 
[A,-,C]

AAAT ATGGACTGAGGACATTTTAAATACTGAGAL 1 TT1 TI 1 ATGACTACAATTTATTGTG 
GGCCCTCT C TT CGOTGAGCT AATCGTCT AAT ACAGGAGACAGGAGACAGACCTCCAAATT 
GGAGTGTAGCATAATGAGGGCAATGATAGAGATA7GTGCTGGCTAACACAAAGACATAGA 
AGACA6CTACCTACCCTWCATGGGAGCTCAA<^»GACT^ 
GCAGOATAAGTAGGAGTTAGCCAGGTGGAAACTGTCATCTC^^ 

18525 TCCATAAATTGTCTCTCATTGGTTTTCTAATCAATCGTGTGTGAAAT^ 

ATTTCACCTTG3CTCTGATG CATTGGAAA TGAGGACTTGATCCCTGGGCTGCCACTTAGA 
ACTTAAACAATAGGCTCCAAGTGGAGCTC^TCTTCTXaAGACAGCTGAATGATTAGCTG^ 
TrATTTAAGGCTCATTTTAGACATCTCCCACOCGCTrGTCAC^ 

TTGATTTTAGACTTCAGACATAATATTOIMTWTATATACTATAI^TAAGTTTAQCAAAT 
[-,T,A]

TCGACTGAGGACJ>TTT?AAATACTGAGACTTTTTTTATGftCTACJUVTnATTGTtgGCCC 
TGTCTT CGCl^aGCT AATGGTCTAATACAQGAGACAGGAGAQIGACCTCCAAATTGCACT 
GTAGCATAATXJAGGGCAATGATAGAGATATGTGCTGGCrAACACAAAGACATAGAAGACA 
GGT ACCT A CCCTGGCATG SGAGCTCAAGGAG ACTTC CTTGACATTT A CG CTGACTCCACG 
AT AAGT AGGAGTT AGCCAGGTCGAAA CTG TCAT CTCTATCTTGCTAGACTTTAAGCATAT 

18525 TCCATAAATTGTCTGTCATTG Q TTTT CTAATCAATGGTGTgr O AAATCTCTTATTTCTTT 

ATTTCACCTTGGCTCT 6A TGCATTGG AAATGAGGACTTGATCCCTGGGCTGG CACTTAGA 
ACTTAAACAA TAGGGTCCAAGTGGAG CTCCTCTT CT GAGAGA G CTGAATGATTRGCTG CA 
TTATTTAAGGCTCATTTTAGACATCTCCCAGCCGCTTGTCACCAATTrrATT CCTCAGGA 
TTGATTTTAGA£TTCAGACATAATATTOGATGATATATACTATAGTTAAGTTTAGCAAAT 
[-,G,A]

TOBACTGAGGflCATTTTAAATACTGAGA CTTT rTrTATGACTACAATTTATTGTGGGCCC 
TGT CTTCCGTGAG CT AATGGTCTAAT ACAGGAGACAGGAGACAGACCTCCAAATTGCAGT 
GTAGCAT AATGAG GG CAAT GAT AGAG ATATGTGCTGGCT AACACAAAGACATAGAAGACA 
GOTACCTACCCTCGCATGGGAGGTCAAGGAGACTTCvTrGACATTTAOGCTGACTGCAGG 
ATAAGTAGGAGTTAGCCAGGTGGAAACTrGTCATCTCTATCTTGCT^ 

19169 CT G GTCAAAGGGACAGAAAG ACAGAAATG CT AA GGA GAATTCA GCAGCAGA CCAGAT AAA 

AAACACCATATTTCATATGCAAAAGT CAACTCAATTGAAACATTTGT AAAACCAAATTTG 
ACATTATAAAAGTATATCAGAGA7CTCATTTTATAAGGAAATAGAAGCCCTTTCCTACCA 
TAAACTAAAGATTTAATCTATATAGCAC^AAATACAATGTTGAG^rAATCATTTTrAATTT 
ATTTITTAACTGACAAAAATTGTGCATATACATGTTATATATATATGTATGTGTGTATAT 
[T,C,A]

TAT ATGA TGT ACAA CATGAT ATTTTGA7 AT A TG 7 ATA CACTG T GGAATGACT AAA TCTAT 
CAA7X3GACATGTTCATTAACTCATACTTATCATTTTTTTGTGGTAAGGACATTTAA 
TACCCTCTTAGCAATTTlCAAGTATACAAAn^TTAGTAACTCCAATXACATATTGTACA 
ATGCATCTCCTAAACTIATGCCT<XTGTCT^CT 

CCCTgT AATCCCCCATTCT CCCACAfiCCCCTGGT AACCACTGTTCT ACTCTCTOCTTCTT 

1 92 i 9 TTTCATATGCAAAAGTCAACTCAATTCAAACATTTG7AAAACCAAATTTG 

ACT hT ATCAGAGATCTt^TTTTATAAGGAAATAGAAGCCClTTXCT ACCAT AAACT AAAC 
ATTTAATCTATATAGCACAAAATACAATGTTGAGTAATCATTTTTAATTTATTTTT TAAC 
TGACAAAAATTGTGCATATACATGTTATATATATATGTATGTGTGTATATATATATGATG. 
T AC AA CATGATATTTTGAT A T ATG T AT A C ACTGT GGAATGACT AAAT CT ATCAATG GAC A 
[C,T]

GT T CATT AAC TCA T ACTT AT CATTTTTTTGTGGT AAGGA CATTT AAAATCT ACCCT CTT A 
GCAATTTT CAAGT AT ACAAA TTGTT AGT AACTC CAAT CACA7 ATTGT A CAA TGCAT CTC C
TAAA I TT A T GC CTOCTGTCTGACTGAAA I 1 ' 1 TG TAT (X TT T GA CTAACATCOCTCTAATC

GTTTTAGATTTOCMATGTGACATCAlt^CGWTTTCTCrrTCTGTGOCTCGCTTAT'lTC 

19325 TCAGAGATCTCATTTTAT AAGGAAATAGAAGCCCTT TCCT ACCATAAACTAAAGATTTAA

TCTATATAGCftCAAAATACAATGTTCAGTAATCATTTTTAATTTATTTTTTAACTCACAA 
AAATTGTGCAT ATACATtHT AT AT AT ATATGT ATGTGTGT ATAT AT ATATGATGT ACAAC 
ATGATATTTTGATATATG TA? ACACTCTCGAATGACTAAATCT ATCAATGCTCATCTTCA 
TTAACTCATACTTATCA T I TTT TTCTCgTAAGGACATTTAAAATCTAO CC P CTT ACCAAT 
[C,T]

TTCAACTATACAAATTCTTAGTAACTCCAATCACATATTCTACAATGCATCTdCTAAACT 
TATCOCTCCTGTCTGACrCMAATTTTCTATCCTTTCACTAACATOCCTGTAATCCCCCAT 
TCTCCCACAGCCCCTCCTAACCACTCTTCTACTCTCTGCTTCTT TG ACTTTAATGTTTTA 
GATT TCCACATCTCAGATCATGT1JGAA1TTGTCTrTCT<3TGCCTG(^TTATTTCACTTAC 
CAT AATGTCATOCAAATTC A TCTCT G T rCTCATAAATGACAAGAT AT T TG rCTT TT C T A T 



19 3«G CAAATACAAfl CCCn T CCT ACCATAAACTAAACArrTAATCTATATAQCACAAAATACAA 
TCTTCA6T AATCATTTTT AATTT ATTTTTTAACTGACAAAAAT TGTGCAT AT ACATGTTA 
TATATATATGTATOTCTGTATATATATATCATgTACAACATGATA l 1 J TG ATATATGTAT 
ACACTtJTGGAATGACTAAATCTATCAATGGACATtrrrCAWAACTCATACTTATCATTTT 
TTT<7It3GTAAGGACATrTAAAATCTACCCTCTTASCAATTTTCAAGTATACAAATTClTA 
[G,T]

TAACTCCAATCACATATTGTACAATXKATCTCCTAAACTTATttCCTCCTX^CTtACTGAA 
ATTTTCTAT C C l ' rTS ACTAACAT^OOCTCTAATCCCCCATTCTCOCACAC CO OCT GG TAAC 
CACTGTTCTACTlTX.TX^.'rTt.-l-ri e^TTTAATGTTTTftGATTTCCACArGTCAGATCAT 
GTGG AATTTCTCTTTCT CTCCCT GCCT? ATTT CACTTAGCAT AATGT CATCCAAATTCAT 
CTCT<rrTCTCATAAATX^CAAGATA l ^T liT X.1 "ri XrrATgGCTAATTgrTACTCCATTGT 



20845 TGTTACTtMiAAOCTTTGTAGATCAGTTGACAATAAATGTCTGGGTGTATTTCT 

TTATC CX ' OH IT A TTACTTTKTATGT C T Cr t I 1 T 1 1 AGAAOCTCTATOCT O VTl rGGTGA 

CTAGAGCTCTGTAGTCAATTTCAGATCAGirrACTATCATGCACTCCAGCTTT^ 

TGCT CAAAATTCCTTTCGCT ATTTGAGTTTTTTT ATTCCATACCyUVTTTT AGGGCTTTTT 

TTTTTTTTOGWTT ACTGTGAATAATGCCAITGGAATTTTGATG GAGATTCCATTGAATCT 

[-,T]

TGGGTAGTATCC^TATTTTAACAGTATTAATGCTTCCAATrAATGAACACAWJCTATTTT 
GCAAT I y &IG T TTT C7 T CAA TTT' C TT r CACCAGT CT ITT! TTCTT AATTTAATr G TTTTA 
TTTC CA T AGGGTTTGGG T AACAGGTGGTG TTTGGTT ATXJAGT AAGTT CTTT AGTGGT GAT 
TTGTGAGATITTGATCCACCCATCACCTAAC CAGTATACACTGTACCCAATTTGTAGTCT 
TGTATOCCl^CCTCOCTtXX^CCATTTCCCCCAAG 



20845 TCTrACTCGAAOCTTTGTA<^TCAGTT(^CAATAAATGTGTCCGTCTA TrfC* rgCACTCP
TTATCCT O TTTTATT AGTTT ATATGT CTCTTTTTTTAGAAGCTCTATCCTGTTTTOGTGA 
CT AGAGCTCTGT AGT CAATTT CAGAT CAG GT AGT A TGAT GCACTCCAGCTTT GCTCT TTT 
TGCTCAAAATTGCTTTGGCTATTTGAGTTTTTTTATTCCATACGAATT^ 
TTTT TTTTO^TTACKTrGAATAATGCGATTGGAATTTT^TtKWGATTGCATTGAATCT 
[T,C]

TGGG TAGTATGGAT ATT^T AACAGTATT AATgCT TCCAATTA ATGAACACAGGGT ATTT T 
TTTCCATAGGGTTTGGGTAACAGGTt^TCTTTGGTTATX^AGTA^ 

TTGT GAGATTTTGATGCAC C CATCACCTAAGCAGT A TACA CTGTACCCAATTTGTAGTCT 
TGTATCCCWatfXTCCCTCCCW^TTTCCCCCM^ 



22234 AGAAACTT TTT AGTTT AATTAAG T CCCACCT AT TT ATCTTTTCGTTGTTGTTCTTTTTTG 

GGGTTGTTTTGTTrTGGCTTG G 1 ' T J* 1 GCATCTCCTT TTGGGTT CTTGCTCATGAAST CTT 
TGCCT AAGCCAAT ATCT AGAAGGGTTT TT CTGAT GTTCT A GAA7TTTT ATGGTTCAG GT C 
TTACATTTAAGTCCTTGATCCATCTTGAGTTS A TTT T TGTATAAGGTGAGJ^AT^AGGAT 
CCAGTTTCATGCT TCTACATGTGGCTTGCCAATT AT CCCAGTACAATTTGTTGAATAGGG 
[T,C]

T AA T ATTT AAAGCTTT AT AT ATTT AGGTGTTCCT A TTTTGCCT ACAT ATTT ATTT AC AA C 

T ATCAT ATOCTCCTGATGGATTGACCCCTTTCTCAT TATATAATGGTCTT CT T G TCT CT T 

TTTACAGTTTTTCTCTTAAAGCCTAATTTGTtrrGATAAAAGTrCAGCTA 

"XYTT GGTTTCT A TTTGCATCGAAT ATT TTTTTC CAACCC7 TCG CA TT CACT CT ATGT GT G 

TTCTT AAAGATGAAATGAGATGCTGTAGGG<XATATGCTTGG O ' J^ ~ rG TTTI ATTCATTC 



22234 AGAAACTTT*TT AGTTTAATT AAGT CCCA CCT ATTT ATCTTTT CGTTG TTGTTGTTTTTTG 
GGGTTCTTTTGT MTtil'.C TGGTTTT&CATCTGCJ T7 IxSGGTrC'IT GGTCATGAAGTCIT
TCCCTAA3CCJUXTATCTAa»AC«7TTTTTCT«TffrrCT^

TTAGAT1TX*CTCCTT l GATCCA,TCTyC* | J^* T ^*TTTTT GT AT*AA<yrGACAGATCAOGAT 
CCAGTTreaT GC TTCT A CAT G r G^CT T GC CAATTiyrCCCmTA^ A rT T G 7 TGA ATAGGG 
[C,T]

TWVTA1TTAAAGCrrrATATATTTACGPGTTCCTATTTrGGCTACJlTATrrATTTACJU«; 
T ATCAT ATCCT CCTG ATGGA TTGACCCCTTT CTCATT RT * T AA TGCTCTTT7TTGTCTCTT 
TTTAO fi 1 * 1 IT T t»I CI rAAAQOCTAATTTCTCTGATAAAAGrTCAGCTACCTTTCCTCTC 
TT r T U. T T T' C I 'MTTGCATGGAAT A TT lT r TTCCAACCCTTOGOVTTCACTCTATCTGTC 
TTCT7 AAAGATtSAAATGAGATGCTGrTAOGGGCAT AT GC n ' {Ada I LI W> IT T T ATTCATTC 



22247 TTTAATT AAGT OCCACCT ATTT AT CTTTT CGTT GTTG T T l V nTlT-J GGGTTGTTT T GTT 
TTO6CTTGCTTTT6CATCTGCTTTTGC0 fTCTl CCT C*T"5AAG*TCTTTGCCT AAGCCAAT 
ATCTAGAA GGG T I T T *f C T GA TGT^CTAGAATTTTTATGGTTCXGGTCTTAGATTT AAGTC 
CTTGATCCATCTTGAGTTGATTT7riGTATAAQCTGAGA« 
TCrACATGTSGCTTGOCAATTATCOCACTACAATTTG^ 
[C,T]

TTTATATATTTAGGTCTTTCtrrATTTTGGGTACATATT^^ 

TtrATOSATTGAaXCTTTCTCATTATATAAT U. ' l CITCT *f CTCTCtTT*n ACA G TTTr TC 
TCTTAAAGCCTAATTTCTCT^TAAAAGTTCAGCTACCI VTljCTV fVT nTGGTTTCTAT 
TTCCATGGAATATTTTTT T CCAACCCTTCSCAT^CACTCTAJCTGTGTr C 'n AAAGATGA 
AATCIW*ArCCTOTACGCCCATATCCTTQCCT C T*l CT ITl'ATTCATTCATTCAGCCACOCT 

2233* CrrrC'IT<jbTf^TGAAGTCTTT GC CT A ASCCAATATCrA(^ACG GI TTTTVTGAT'GTTCTA 

GAATTTWATCCTTCAGGTCTTA^ATTTAAGTCCT^ 
ATAAGGTCAGAGATGAGGATtXJ«?TTTC»TGCT7CTACA 

CT ACAATTTGTTGAA T AGGC TT AATATTTAAAGCTTTAT AT ATTT AGGTG TTCCT ATTTT 
CCGT ACAT ATTTATTT ACAACT AT CA T AT CCTC CTGATGGA IT GACCCCT TTCTCATT A 1 
[A,G]

T AA CTTTTT ACAGTT TTTGTCTT AAAGCCTAArT TG TCTGAT AAAA 
CTTCACCTACCrmCTCT C TlTT OG T r rCTATTT<^ T rCC AACOCT 

TCG CATTCACT CT ATCTCTG TTCTT AAAGATCAAATnAGA T QCTCT *G GGGCA TATGCTT 
G GGT CTTGTTTTATTCATTCATTCA€CCACCCr7TTGATTM 
ATTCAACOTAATTATrCACAGACAAGGACTTA^ 



2 3033 AT linT T Vri rarTCTACTATA GU T I T Ti^Ct T re^ 

ATAGTrATAAAAWXTTATTTTAAACTGATAACAGCTT^ 

ATACA CTTTT ACTCT A CCAACTGCOCTCCA TTT T AT<»TCTTTGA TGTCATAATTT ACCTA 
GTTTTGCAGA T GT GTCCCCTT ATTCT GTATCXXTTT AACAAATT ATTGT AGCAACAGT CA T 
TTTTAATAGTTTTGtKTTTTAACTTTATACTAGAGATAG^ 
[T,-]

ACATTATTAGGOTAtTCTAAATrCACTATOTATTTA£XTrr ATCAGTGAG A T T TT T G TT 1 
T^AATTTTC^TGTTtrTTAATTACTATTCTTTCATTTCAACT^ 

TTTTTTGT AAGATGOGTCT ACT AGTGGTSAACACCCTCAAC TT T1 * g> ' l T TA TCTGGAGATG 
TCTTTACCTCTGCTTCATTTTGAAAT ATAAUTlTTtftTCCATGATTSAAATGGACAAAAT 
T tiTTl T TT f AATTATGCAAAGTCCCAGGGTAAGCAGAATTACT C TTr T r 1 1 1 TTTTTCTG 



23036 TTTTtrT TGCTCTACTAT AGt» Jtl f l^SC'l'f TXJTtAn^'ACCATCIAGGGTT'ACATAAAGCATA 

GTTATAAAAGGCTATTTTAAACTXMTAACAGCTTAACTTTCAACACTC 
CACTTT T ACTCT A CCAA CTGC CC TOCA7T TTATGTCTTT GATGTCAT AAT7T ACCT AGTT 
TTGGAGATtrTGTCtXXTTATTGTCTATCCXTTAACAAATTATTGT 
TAATAGTTTTOGCTTTTAACTTTATACTACAGATAGAATTAAT^ 
[-,A]

TT A TT A5GXTT ATT CT AAATT £AC T AT GT ATTT ACCTTT ATCAGTGAGA TT TTT STTTT CA 
ATTTTCATGTTGTT AATT AGT ATTCTTTCA TTT CAA CTTGGACAATTCACATTAGCATTT 
TTTTJTAAGATGCGTCTAGTAGTGGTGAACACTXTCAACTTTTGT^ 
TTACCTCTGCTTt^TTTTGAAATATAACTTTT^ 
TTTTTTAATTAra-JWUJireCAGX^^ 

23422 CTT/TCATTTCAACTTGGAGAATTCACATTACCATTTTTTCT 
TGAACACrCTC^CrrTTTtrrrrATCTGGJ^ 
TAACTTTTGTTTCCATGATTGAAATGGACAAAATT^tTTT^ 
GCTAAGCAGAATTACT C TTTTTTT n " rJ T VC T 
GCTCGASTGO^GGCGCAATCTXrrCAGCT^^ 
IA.OI 

ATTCTTCT GCCT C AGC CTT CCTGAGT AGCTGSGATT A CAGGCATGCACCA CCA7GCTOGG 
CT AATT TTGCATTTTT AGT AGAGA70GG3 G TT TC TOCATGTTQGTCAGGCT
2SSB2



26401 



26473 



2SB<« 



2B384 



28417 



ACCCGACCTCAGATGATtrOGCCCACXTAXSXCTCCC^ 

GCCACTGOC^CTGGCCAGAATTACTCTTATrTATCCTGAGCTTCAGGAAGAAACAATTC* 
AAATTAAAATTTC^CATTACCTAATGGCCAAAGCCTGCATTCAAAATAAGTAATCASAW 

CCCAAAGGCACATAGCCACT^GCAGCAAAGCTAAGCCt^ 

CJ^CCCCAGGCTCTCTTCCATTGTGGGACAT CATTTCT AAGAT AATCT TT^TTl tJCCTCAG 
TTTGAGACCGAGCTGAAACTTCATGGAAAATAGCACCAGCATCTTT ATCTGAAAGACCAA 
CCGGGATCTTTOGCCTCATCATC^TAATATCAOCCTTATAAATATACAACATTrAATACT 
TAATATAGAGCCTIty«aCCCATTATCTCAT^^ 
[T,C]

GCTTAT*CAATGATTTACAGTTCACTGAACACTTTTAAGTACTTTCAATCTGGCCCAAAA 
TCCAGACGCAJSCOCCAATGTGTACATGACATTAACTt^T^TGAGCAGAGCTAGAACTTGT 
OCCGAGACCCTCACTCTOGAGOCTACA C Tt C T TCO GAACAACACAGGrTTCT C ACCAGOG 
CT7 AT AGGAA^2CAGAGSGGTCATGTGAGACAT ATT AT^TGATTCAATGTTCTATt AATTC 
ATGT CTT AGGAAGC AAGCCAACAGGATTGCTT CTGGCAAACACCT A CAG CCTGTT ACTGT 

CCTCTOU<GACAjCL&CTCTTCCTATGTAGCC^ 

TCACT^AACCTCTGOCTCCCAGGTTTAAGCAGTTCTCCTCCCTCAGCCTCCCGAGTACC 
TGGGATT ACAGGTGCACAOCAOCOCTGGCAAA 1 T f TT G TATTTTTATT AGAGATCGCGTT 
TCACCATGTTGCCCAGGCTACTCTCAACCTCCTCA TCTCGAGACCAQCCCTCCTCASCCT 
CCCAAAGCGCTtXiGACTACAIXXZATGAGCCACTGCACCCAGCC^gTTCTGTt Kn^Tf ATA 
[C,A]

CTAAATTGTCTCCAGGACTCC7rrAATAGTCCATrAATAGCTATTTAOGCCAMCACACTG 
GCTCACCCATATAATCCCAATAT 1 1 T G r & ACAOCAAGCTGGGAAGACT GC TT & AAGTTAG 
GAGTCTGAGACTAGCCTGGGCAACATAGGGAGACCCTGTCTTTACAAAAAAAAAAAAGAG 
AGAGATAGCGACCCAT6 G T G TT GCATCCTTGTATTOCT G CCTACTTGGGGGACTGAGGCA 
GGAGCATCACT TCAGCTCACAACTTCAA06TT ACCCTGACCAATGTTCAC<^CACTGCTC 

CAACCTCT<K3CTCCXXAGGTTTAJUSCA G TTCTCCTGCC 

TACAGGTGCACACCACCCCTGCCAAAT TTTT G TATTT TTATTA6AGATGGGGTTTCACCA 

TXHTG«X3U5GCTAGTCTCAAGCTCCTGATCTCGAG3tf^^ 

QCOCTG9SACTACACCCAT(UCCCACItXACCC^ 

TTCTCTCCAGGACTCCTTAATAGT<^T7AATAG^ 

[C,T]

GCAT AT AA TCC CAAT ATTTT GTGACAC CAAGGTGGGAAGACT GCTT GAAGTT AGCAGT CT 
GAGACTAGCCTGGG<^CATAG«yUGACCCTWrm 

AGCCLAGGCATtyrH3TTG(^TGCTT*G T ATT CCT GCCTACTTGGGG GA CTGAGGCAGGAGGA 
TCACrTGAGCTCAGAAGTTCAACGTTACCCnXJ^^ 

CTGATTGACAG TCCAGACCCTGACTCT AAACAAAAACAAAAAACAAAT ATTTAAGT AATT 

TOSG CAACATAGGCACACCCT CT CT TT ACAAAAAAAAAAAAGAGACAGAT AGCCAGGCAT 

GGTGTrGCATGCTrGTATTCCTrcCrACrrTCGGGGA^ 

TCAGAACTTCAAGCTTACCGTCAGCAATCTTCACGCCACT 

GGCCAGACCCTGACT CTAAACAAAAACAAAAAACAAATATTT AAGT AATTTCC AAACAT A 
GCAGAAAATAT AAGCATGGTTTATCACTTTGA TATGACACCAACAGCTACTTAAGAT AGA 
[G,A]

TCAT GAATTCAGT AAATTGTTGTGTGGAAAGCTAAGGTGCCAACCCAAQCCGCATCTTC7 
TAGGTCCT CCTCACTGGTGTCATCAG CTACACCAGGCAGAG CATTGCCAGGAGCTAGCTC 
TTCCCTTCAAGAACAAAAGTCTtGTTT AAGAG CACAGT AGCCCACAACTTGCTCTTTCTC 
CTG CAtiTCTCT TTT A 1TTC CCTCCTT T CTT AGGGATCACOGTGGTT CTTAGTATTTG GGG 
TCTTCACXACAACCCTGCTGTCTGGAAAAACCaiAAGGTA 

CTTCCAQGCAACCGTAGATCTTGCTGCCTATTTGAGCCCCAAAGGATCAGTT 

AAAGGACAATCGTATTCTCTGTCACATC X r r T rr G GCCATGCCTCAAAAGCAGTCCCACA 

ATGTAASCTnCTGCTCATAGGCTCAATGCAGTCCAOCTTCAAAGCAAGAGAAATAATTT^ 

ATCAGTAACTCCAACTGCCGCCTTGT TAT ACGGAAGGCATCATGTT GGA fiCCTCCCAGCT 

CAAATTCTCACAGTGAACAATTTAAGTeTAAA^STTCSULAAGTTTCAATGG 

[A,-]

AAAAATATCACTTTACTGTGCACTTCA3ACTTCTTGTACTAOTATTTTACTATAGTCAGA 
JIGAAA CAT CATTTTTT CAAGT ATCAC TIT CTT i CCCTCTTGTCTT CAGSAACTGCATTtiG 
GCAGGA G T TTGOCATGATTCAGTTAAAGGTAACCATTGCCTTGATTCT G CTCCACTTCAG 
ACTGACTO^VGACCtCACX^GGCCrCTTACTTT^ 
GAATGGGATOTATT/TGCACCTGAAGAAACTCTCTGAATGTT^ 

GAGCCCCAAAGGATtiACTTAGTTTTACAJUiGCACAATeGTATTCTCT 
TSeCCATGCCTOUUUWCAGTOXACAATGTAAGCTACTG^
CAOCTTCMAGCAACACAAATAATTTCATGAGTAACTCCAACTGCOGCCTTCTTATAOSC
AAGG CA TCATGTTGGAGCCT CCCAGCTCAAAT TCT CAC AiTTJAACXfi TTT Aw\GTCT AftAG 
TTCAAAACTTTCAATGGCATTTGGT(KAfcAAAATATCACTTTXCTGTCTACTTCAGACTT 
[A,C]

TTGT ACTAGTATTTT ACT ATAGTCAGAAGAAACATCA I T il T I CAACTATCAC1TTC7TT 
CCCTCTTGTCTTt^GGAACTGCATK^^ 

CATT5CCTTGATTCTGC7 CCACTTCAGAGTGACI CCAGACCCCACCAGGCCTCTTACTTT 
CCCCAACCATTTTATCCTCAACOCCAAGAATGGGATGTATTTCCACCTGAAGAAACTCTC 
TCAATGTTAGATCTCAGCGTAC^ATGATTAAAOCTftCTTT C TTTTTCGAAgTTAAATTTa 

29265 TATCCAWn*ATAAGTCCATCJTATCCrCAC7CTCAJ^^ATT0CCAACACTAGAAAATCAT 
GTAGIWTAAAAATTTrAAATCTCACTTCACTT*t^CGACATTCCAItJOCCTGACCAATCC 
TACTG C TrT f CCTAAAAACAGAATAATTT GC TCTCC A TTCTTTCAGACT TTTTCCTAtAC 
ATTTTAT ATGT AGAAATGT AGCAATGT ATTTGT AT AGATGT GATCATTCCT AT ATT GTT A 
TTCIA!TTTTrT CACTTAATAAAAAITCACC- Tf ATT CC TT A TCfl WA.1 1 ' Tft TGGTATTCT 
[A,G]

TAATATCAATGTACTATAATTTATTTAACT A l T l ' T CCTTATTGGGCATTTAACTTATTTC 
TAGTTTTAAAAACATCCTTGTCAATYU^AACAAAAGCCAAAATTGACAAATGGGATCTAA 
TTAAACTAAAGAGCTTCTCCAXAGCAAAACAAACThCCA?CACACTGAA7^iGSCAGCCTA 
CACAATOaauaUUUOTTrrSCAACCTACTCATCTGJ^ 

ACAATGAACTX^AACAAATCTACAAGAAAAAAACAACCOCATCAAAAAGTCGGT G AAGCT 

29484 CTSATCATTCC7 ATATTGTTATTGATT TTTTTCACTT AATAAAAATTCACCTTATTCCTT

A TCA.J T UC TI'T ATGGTATTCTGTAAT ATCAATGT ACT AT AATTT ATTT AACTATT TTCCT 
TATTGGGCATTTAAGTT A TT rC TAGT T T T AAAAACATGCTTGTCAATGOCAACAAAAGCC 
AAAATTGACAAAT 3QCATCTAATT AAACTAAAGAGCTTCTCCACAGCAAAACAAACTACC 
ATC^CACTGAATGGGCAQCCTACAGAATGGGAGAAAATTTTT^CAACCTACTCATXTGAC 
[A,C]

AACGCXTAATATO^VCAATCTACAAT^MCTCAAACAAATGTACAAGA 

CAT CAAAAAGTOGGTGAAGCATATGAACAGACAC JtC rCAAAAGAAGACATT TACGCAGC 

CAAAAGACACjma^AAAAATOCCTATCOT 

CACAATCAGATACCATCTCACACCAGT^AOAATGGC^TC^ 

CAGGTGCT(K^GAGGAT(7TGGAGAAAT AGGAAGACTTT T ACACTGTTGGTGGCAGGAGAA 

3041? Al^CATCTACCTAAAATXTATATATAAAAAAATCCCTCCCTTGAAnCCAGATCCTTGGA 
GACAAACACCCAarrCTAAAACCAAAT^GTTTAACACT^^ 

TTTCCAT7TT GT CACT A TTTTGTCAG CTG ST AT ACCAAT A TCCACCCAGTT AAA CAAT A T 
TTCCrT C rrTrrTTCTGCTACAAACCCAAATAAATTACAA 

CTAAAATAACTCACTTTCTCTATATATCTCCTTCTTCCTGGAAAAATGGCTTAGGTTACT 
[T,-]

CTTTAAAAGCATGCATCATAAATTCT ACTGAATACAATAT7CAGGTCTGGACATACTACS 
TAT A A TTT TC T G TGTCT CTGGGGTCTTACCT ATTTC GGGTCAAAATAAACAAGTTTATTA 
AGCTTA7T AATA TTCAATTT CATT ATCTTCTTT AACAATTATGTTCCCTGGT AGTTTCA7 
TtKCAATAATTTATTT^naiCCTT^CACtrrGCTTCT 
TCCAATrTTACTTTAAATATTTTTACAAAAGAWnCTOTTA 

30783 TTTTCTCTGT CTCTCGGGT CTTACCT ATTTCGGCTCAAAATAAACAACTTTArTAACCTT 

ATT AAT ATTCAATTTCATT ATC TTCT T TAACAATT A TGTT OCCTGGT AGTTTCATT QCCA 
ATAATTTATTTtTTCAGGTTGCCAGGTGCTTCTAAACTTCTGTtrr^ 

TTTTACrTTAAATATTTTTAIWUVASAGGTCTt^TAAATTTCCTAATAATTATTATATTA 
TT' G TTTTTTCACTCACATTTTGT GAATTGAAAACCCTTAAAAATATGAAAT CATTTTTTC 

[C,G]

AARTATGTGCCACAGA CAArTTTGTT"AAATAAGAAGACAGAAACAGGGCAT7ATCAAGA3 
ATAAATATTCAATATACCnATATTtCTGTCACACATTTTTAt^CCAACTQTGCCAAAAA 
TTGTATATCATATAAATGATAACAAGTT^CAAAQGCATTCCTTTATCCCTTAACTCTCA 
AATT AGAAACTTTCAT AGGT AGGAAGT AGGGGAAGCATAT ATTCCCT7TGAAAGGTGCAA 
GAAAA TG TCATT GGCA TT CAC CAT GGT ACTCTT CAAGCT7 AAAAAAAATC5ACTG C AAAA

SEQUENCE LISTING

<110> PE CORPORATION (NY)

<120> ISOLATED HUMAN DRUG-METABOLIZING PROTEINS, NUCLEIC ACID MOLECULES
ENCODING HUMAN DRUG-METABOLIZING PROTEINS, AND USES THEREOF

<130> CL0C0897PCT

<140> TO BE ASSIGNED
<141> 2001-10-05

<150> 60/241,745
<151> 2000-10-23

<150> 09/139,456
<151> 2000-12-19

<150> 09/818,647
<151> 2001-03-28

<150> 09/852,067
<151> 2001-05-10

<160> 4

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 2327
<212> DNA
<213> Human

<400> 1

cgcgcctgcc tcctctcccc aggcctgagc tgcccctccc actgcctttc cttcttcccg
60

cgagtcagaa gcttcgcgag ggcccagaga ggcggtgggg tgggcgaccc tacgccagct
120

ccgggcggga gaaagcccac cctctcccgc gccccaggaa accgccggcg ttcggcgctg
180

cgcagagcca tggaattctc ctggctggag acgcgctggg cgcggccctt ttacctggcg
240

ttcgtgttct gcctggccct ggggccgctg caggccatta agctgtacct gcggaggcag
300

cggccgctgc gggacctgcg ccccttccca gcgcccccca cecactggtt ccrttgggcac
360

caqaagttta ttcaggatga taacatggag aagcttgagg eaattattga aaaataecet
420

cgtgccttcc ctttctggat tgggcccttt caggcatttt tctgta^cta tgacccagac
480

tatgcaaaga cacttctgog cagaacagat cccaagtccc ggtaccigca gaaattctca
540

cctccacttc ttggaaaagg actagcggct ctagacggac ccaagtggtt ccogoatcgt
600

cgcctactaa ctcctggatt ccattttaac alcctgaaag catacattga ggtgatggct
660

cnttctgtga aaatgatgct ggataagtgg gagaagattt geagcactca ggacacaagc
720

gtggeggtct atgagcacot caactegatg tctctggata taatcatgaa etgcgctttc
780

agcaaggaga ccaactgcca gacaaocagc acccatgatc cttatgcaaa agccatattt
840

gaactcagca aaotcatatt tcaccgcttg tacagtttgt tgtatcacag tgacataatt
900

ttcaanctca gccctcaggg ctaccgcttc cagaagttaa gccgagxgtt gaateagtac
960

acagatacaa taatccagga aagaaagaaa tecctecagg ctggggtaaa geaggataae
1020

actccgasga ggaagtacca ggattttcrg gatattgtcc tttctgccaa ggatgaaagt
1080

ggtagcagct tctcagatat tgatgtacoc tctgaagtga gcacattcct gttggcagga
1140

catgacacct tggcagcaag catctoctgg atcctttact gcctggctct gaaccctgag
1200

catcaagaga gatgccggga qgaggtcagy ggcatcctgg gggatgggtc ttctatcact
1260 

tgggaccagc tgggtgagat gtcgtacacs acaatgtgca tcaaggagac gtgccgattg 
1320 

attcctgcog tcccgtecat ttccagagat ctcagcoogc cacC'occtt cccogatgga 
1380

tgcacattgc ctgcagggat caccgtggtt cttagtattt ggggtcttca ccacaaccct 
1440 

gctgctgtct ggaaaaaccc aaaggtcttt gacccettga ggttctetca ggagaattct 
1500 

gatcagagac acccctatgc ctacttacca ttctcagctg gatcaaggaa ctgcattggg 
1560 

caggagtttg ccatgattga gttaaaggta accattgcct tgattcrgct ccacttcaga 
1620 

gtgactccag accccaccag goctcttact ttocccaacc ettttetcct caagcccaag 
1680

aatgggatgt atttgcacct gaagaaactc tctgaatgtt agatctcagg gtacaatgat 
1740 

taaacgtact ttgtttttcg aagttaaatt tecagctaat getccaagca gatagaaagg 
1800 

gatcaatgta tggtgggagg attggeggtt ggtgggotag gggtctctgt gaagagatcc 
1860

eaaatcattt etaggtacac egtgtgtcag ctagatctgt ttctatataa ctttgggaga 
1920 

ttttcagatc ttttetgtta aactttcact actattaatg ctgtatacae caatagactt 
1980 

tcatatattt tctgttgttt ttaaaategt tttcsgaatt atgcaagtaa taegtgcatg 
2040 

tatgctcact gtcaaaaatt cccaacacta gaaaatcatg tagaetaaaa attttaoatc 
2100 

tcacttcact tagccgacat tccatgccct gaccaatcct acrgctttlc ctaaaaacag 
2160 

aataetttgg tgtgcattct ttcageetct ttcctotac* ttttetotgt ageaatgteg 
2220 

caatgtattt gtetagatgt gatcattcct atattgttai tgattttttt cacttaataa 
2280 

aaatteacct tattccttaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 

2327

ccagcctctc ttaggctcct aaatatagtg eaaaaagttc cagagttcct ttgttaccca
60

tga«ageaca tggaacggtg ctggacaggg gcaactggcc ctggagcaga ggagtaactg
120

catagaactg teeaagcetc agagggagtc acaccaccag caagaacctg ggtgggagta
180

ggtgagccoa ggggttccco ggctctgacc ctgccaagag oactcattag aeggtcacca
240

accacacato ctattcdcg gtctcatgaa gaacccaggg accggaccag gcaagatatc
300

aeaaagetga agtttcagct ctggggcaga gcatggatct gagg~ctttg gccccaccac
360

catgcgotco t-Btgagggcc atcatacaoc catcatgatt tgggggagga atagggcata
420

gaggaatcot atgaaaagct gaeatgceat gagttaccca gaagaagctg tgtaagccag
480

aggattctga gaccctgtca aataacaaca tctagttgaa ggttggagtt aggtaggagg
540

togggaagtc tgggaeagno ggagctgaaa cacttgctgt gtgtggctta atggaacatg
600

caaggggcca ggacgaactt ggtccagatg aagtcaccac eccctggggc ctgtcttttt
660

tttttttttt tttttttttt tgagacggag tctcactctg tcaccaggct ggagtgcagt
720

ggcgcgatct cggctcactg caatctttge ctctcgggtt caagcgattc tcctgccxca
780

gcctcctgag tagctgggat tacaggcgcg cgccaccacg cccagctaat tttagtactg
840

ttagtagogr. tgggg*t?.ea ccatcttggc caggatggtc ttgatccctt gecctcgtga
900

tccgcccgcc tcggeetcce aaattgctgg gattacaggc gtgagccacc gegcccggcc
960

ccctggagec tgtcttaatc acttaccogs caaataaaat ctggctccag agagtggagc
1020

gtaggcttaa ggaattgggg gcggaagggs ggggaaggtg ggggagggac agtgataggg
1080

agaacaggga attgtagcag aaattgggtt tattgttcag agctgtcaat gaacacttaa
1140

catatgcctg tcttagccto aatcaatgoa taaatgaotg aaraaa^aaa tgaatgaaat
1200

gtgggcaatg cctataaaga ttgctgggas agggaggtgg gqggagacac cagcttggga
1260

agtcaggcct grtagatcct agttcaccac ctgatacgtt acaaatacta aaaccatcac
1320

tttcaaatta tttttactac attttcctgt tatctgtact cgagtttatt tatgtttclg
1380

geatctageg tcagcccttc atgggcatga gaoccaagca gccacacgag gctctgaacc
1440

cagaagagca tatgctcggt ttaatggtct gtcatcttag aactgttaat aaagttttta
1500

tcccgcactt ccattttgca ctgagactca taaattatat agcaggccct gactgtacct
1560

gtategtgga at*actatat gatggtacg= tactgtgcat atcttccccg ttcagtgttc
1620

ogtgccctcg tatcggcagc ttgaactagc tcatggtaca cgctgggaat cagggtggga
1680

ateagttgta aaccatttac cggaacacca ctaggcaggc cacaggatai aggaataatg
1740

atggtacacc tccccctacc tctaecacct gggaattttg gtagaacgcc agaatggaaa
1800

agoooatctc ttgcotegcc atttotoott tgtgotaagg aagaaaaeca otgacctcag
1860

otttagcatt attttacaat ataaattcag atcccgtgac tgaaaactgt tggacttaaa
1920

agaggacgct ccaggagcgc aaaag=agtt gggccgaacg aagcgtgcgc gctttggtaa
1980

ccggctagaa atcccgcacg cgcgcctgcc tcctctcccc aggcctgagc tgcccctccc
2040

actgcctttc cttcttcccg cgagtcagaa gcttcgcgag ggcccagaga ggcggtgggg
2100

gtgggcgacc ctacgccagc tccgggcggg agaaagccca ccctctcccg cgccccatga
2160

aaccgccggc gttcggcgct gcgcagagcc atggaattct cctggctgga gacgcgctgg
2220

gcgcggccct tttacctggc gttcgtgttc tgcctggccc tggggccgct gcaggccatt
2280

aagctgtacc tgcggaggca geggctgctg cgggacctgc gccccttccc agcgcccccc
2340 

acccactggt tccttgggco ceagaaggta aatggaaggg aaaaaggnta gaaaaggagg 
2400 

eagagggggg cggaggagga tgcggcagag gagcccagcc ggcagagaga cgcagctttc 
2460

ttceatccct ggggaccetc cggcttgeae cggcctttcc agcccggcct gtggetctte 
2520

gcatcatttt tccttgctct ggagaattg= tttcccgcag cccsscaggg aaaggtcaca 
2580

aaagaggaag ctttgggggc tgggagagag ctatttaaag aacctgaata tggaaaaaga 
2640 

aagcgagctg rasctcaagt ctgtctctca ttgcttcacc aagscttcca catgtgtcgc 
2700 

tttaaaaata gcatgttatt ctaaataact tattagttgc agaoaotatg caaaatctst 
2760

cccaatcgtt ggcaccctta gtccatttte acaagagaaa ottttotttt cctaagattc 
2820 

ttgtgaagta aggagcagcc ccagccagc= actcgagaaa tactgattga tggaaatttg 
2880 

taaagggaga ctgttagctt ttggtctctc ccgtttttta aatccactcc cacccctaat 
2940 

taaggttttt attcattcaa ccgactctga gtggcaattg tgtgataggt actaagatte 
3000 

caaagagaag ctaagtccct cccctgoacc aoccaagtca ggtgcagact taggccacag 
3060 

agagaaaatg aaaatttaag gcaatgggtg ctttactaga ggcctagaga caagggaata 
3120 

tctgtcggag gaaagtatac atctccgcct agagaaggaa ggaaagtctg tgaagggetg 
3180 

agcagagtct taaaggatgg ttgggtggtg tggggaaggc attccagcag ogctactaca 
3240 

cgatcctttg gtttccccac tttctagtct tccttatata aageaaccac tttcaactct 
3300 

tttatcggtt ccttctggta tttaaatact tatttgtaaa atagtattac eatattgcat 
3360 

ctattaattt aataagttta gacatctgct gtggtttaga tatggtt.tgt tcgrtccccac 
3420 

caagcctcat gttgaaattt gattcccait gttggaggcg ggatctgatg ggagatcttt 
3480 

gggtcattgg gatggatccc tcatgaatgt cttgqtgcag ctgtctccrtt cataagttert 
3540 

cactctctta gtecctcttc oacccccago acrtgattgtt gaaaagagcc tgceacctcc 
3600 

tcccctctct cttcctgtet ctcoccetgt ggtctctgca escaacrgct cctgttcoct 



tccactatga gtggaagcag tctgagatcc tccgcagatg cagaigccaa tgccatgctt 



cttgtacagc ctgcagaatL gtaacccaaa toatcctctt tgtgaatgac ccagcctcag 
3780

gtattccttt acagcaacac aaatgtacta agacaacatc cacctatgaa cttctttotg 
3840 

acaggcaatc auttavnctt catattccac tgtcocagta actatatagt attgtatttt 
3900 

ttaaatagaa aaacttctat ttgtattatt tttattatgc aaatgttatt tactgctgat 
3960 

ctaaatggtc ctctttcatt ttatttcett ttctcataga actttttecc cacccccaca 
4020 

gtottgnnr.n annrmiuinnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
4080 



3660 



3720

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
4140

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
4200 

nnnnnnnnnn nnnnnnnnnn ntgttotgta tctctactgt ctcatgaata ctatgtcgtc 
4260 

tgttgtttta attgoottgt tttggcatcc ttgtcaaaaa tcaattgacc ataeatgtco 
4320 

aggtetattt ctgagtcttc aattctaatc cattgatcta tatgtctatc ctaactcatg 
4380

gacacagaga gtagaaggat ggttacoaaa ggctgggaag gatagagggg agctggggga 
4440 

ggaggtaggg aaggttaatg ggtacaaaaa aaatagaaag aatgaataac acctactatt 
4500 

tgatagcata gcagggtggc tatagtcaat aataactgta cacttttaaa taaagagtgt 
4560

aataggattg tttgcaactc aatggataaa tgcttgaggg gatgggtacc ccattcttca 
4620 

tgatgtgcct atttcacatt gcatgcctgt atcaaaaaca tctcatttac tccataaata 
4680

tatacaccta ctatgtatcc acaagtatta aaaattataa ataaataaat tatatagcta 
4740 

tccttatgct agtaccacac tgccttactg ttgctttgta gtaagctttg aaatcaggaa 
4800

gtatgagtcc cccgcacttt ggtattttcc aagattattt tggctgttcg gaatccttga 
4860 

tttctataca aattttagac tcagcctatc aatttctaca aggaaaccag ctagggttct 
4920 

gcttgggatt gcactgaatc tgtagatcag tttggggatt attgccatct taagaatatt 
4980 

eggtcttctg atccatgaac acagaaagcc tttccgttta gttaggtcat ctttaatttt 
5040

ttttgttgtt tttttttgtt ttttgagaca gagtcctgct ctgecgceca ggctggagtg 
5100 

cagtgacgca atctcggctc actgcaacct ccgcctctcg gatccaagcg attctcctgc 
5160 

ctcagcctcc caagcagctg ggactacagg eaeatgccac cacaccaact aatttttgta 
5220 

ttttcagtag agacggggtt tcaccatatt ggccaggcta gtctcgaact cctgacctsg 
5280 

tgatccaccc gcctcaccct eccaaagtgc tgggattaea ggcgtgagcc accactcccg 
5340 

gctttcttta atttttttta acgatgtttL tgt.attt.ttc aaagtatac* tcttgcattt 
5400 

cttttgttaa atttatttgt tttgttcttt ttaatttcat ttcagactat ttattgcatt 
5460

catagtgttt tagagtccac attccctctt gactgtcaet aagttttttt ttttctgttt 
5520 

ttgagaggtt tctatcagaa ttttgcagat cagagatgac ggazatgtca aactgtctaa 

5580 

tattaccaac cctccccatt tatcagatca ggatcctttt ggtgattcac catgcaggge 
5640 

aatctagtat ctaaggctoa aaaggtgata ctgttttac* taggcagtaa cattttattg 
5700

cxacataata actacatatt tatggagtac ctgtgatatt ttgatacgtg cetacaatgt 
5760 

gcagtgatca aateagggtg tttagggtat tcatcacttc taacatttat tatttatttg 
5820 

tgtttggaac atttcaagtc tcttcaagct cttcagaaat attcaataca ttattgttaa 

5880

cagtgctatt gaacactgga acttettcct tctatetaaa gacagtaaca ttttaagtat
5940 

ogtcataagg ttacagaagg ataaagrgtg tatagggaaa attceetaca agatgagaat 
6000 

ttcatccctt actcttagta atacaggtct tcaaacatgc coaggatatt cctcccttgg 
6060 

egctttgaac a-gcoctjtct gtggttatat tgctcteect gcaaattatt cctaaaogsg 
6120

gcttgccctg accattcaga ctaaaatage aeetctagta ctctctatct eeaacectat 
6180 

tattatcatc .tcggecctca tcatrtc-crtg scactotact gtatactctc ttgcttgttc 
6240 

gtttattatc caccactaac tacaatatea aatctgtgag aggtaggetc tttgtctgcc 
6300 

actataascc tagtgcetgg tocogttcct ggtgcataat aggtgctcaa taaatccttt 
6360 

gttgaatgca taaatatatt aggtgccgag aaaatttatt tattcaaaga tcaatttact 
6420 

g&atagaata ggccaggtgg ttcgacattt attcaatagc caacatatgg gacctaggat 
6480 

gtacatatgc aagtgtgtgt gtgtatgtgt gtgtgcatcrt gcatgtgtsc Ltggatgtac 
6540 

tgcagagaac atctatgtag ctaagtagta taaagcaott gggctccaga gttaaactgg 
6600 

agtttgaatc ctcattagtg gttgccagct gtacacactt gggcaqatca tttaacctag 
6660 

tctgtagggc tcaatttcct catctctaaa gtagggattg taateatate tacttcatag 
6720 

ggttcttgat gtaaatatta aataacatag aacatggaaa gcatttagca geacctegtt 
6780

catagcagtg cttgataaat gttcgctgtt gctatttggg ggcactatgc attttctgaa 
6840 

catttctgaa caatgtttac taaatatatg tagtacccgt tttcaagtgt atttagatgc 
6900 

ttctctgggg atgaagaaat ataaattaaa tatagtacag tattcacaac agttttetgt 
6960 

cctttttgtc tagtcaggag ttacaaaaag tataatgaaa tactttcata tggctggggt 
7020 

gtttatgaoo otttrttocc taaacaaaca ottgtcatot tagtttacaa tattcatgag 
7080 

ggcaaaggcc ttgtcttcct tatatttcte tgtatctcta ccacctggta cgtg~tgatag 
7140 

acastaaaca cctgtgtgtt tattgtttgt aaatgaataa atgaaaaaat attcacattg 
7200 

ttgaooacca ctactctgga tagtcagtgg gtgcttotco ctggcrttgat tatggcaaca 
7260 

ttaacaaaaa agtgeagtat tttegaaacc aggtttcaag actctcaaco tttcagtggc 
7320 

cttgaactat ccagagaaca ctttatgggt taaaattgct aaatgataac agagaaaaat 
7380 

gggagccaga gttgtccacc tctccagagg atgagagcaa acaatcctgc agcagotacc 
7440 

gtgtgattgg tcacacgagg aaaaatctgg cagccttaag attactttgc agcgggggac 
7500 

tcccaccatc atgctcaagt gtgtagatgg gcacaccaaa acacacacat gcaggtgccc 
7560 

tccactttac acaagaagca aatgtaaatg aatettgttt tcagtgattt agagaaacaa 
7620 

tttaagfcgag ccattoctca Lctgcttcta aaagcaaaaa ctccLLctct ggtggtagta 
7680

tttQoictct catttgtaaa tgttggaage Egaaagtttt gtatttgagt ttgcttt*ag
7740 

&ttcacacat ctgtgtaaat ggaccttctg ttgttggggg gagaatttgg attttcttta 
7800 

tagatagagt tggcoatttt ttagagagaa gcatttactg ctaagtcatg agaaataatc 
7860 

actggtgcat aattagagag aggaacagga agaagaaatg gtgagcrgga tgtagggtc* 
7920 

tgccccattt agtaactgtt agtttcccac ataggaaata cttrttttta gcttccagat 
7980 

cccactccaa tctgagtgtg tgatgttggc aagtgaggca gagagtgtga crcggctcec 
8040 

cctctattgg gacaagagtt cacaglaaat gtcattcaac agtgacttgg tctgggggto 
8100 

caggatatat taatattgag aagataaata cactaactrt gtttagagaa ttatccccca 
8160 

agcttagaag ccecaaagaa agcatgttat gtcactccca gaaaagtctc aggctcctct 
8220 

gcttgtgtga ccttatcagg tcctgaactc agcttgtgtc tataagaggg gacaggtcca 
8280

gcttggctgg ctaattactt ttactttttt cactgcagtt tattcaggat gataacargg 
8340 

agaagcttga ggaaattatt gaaaaatacc ctogtgcctt ccctttctgg attgggccct 
8400 

ttcaggcatt tttetgtatc tatgacccag actatgcaaa gacacttctg agcagaacag 
8460 

gtaagaagag ggggaaagct etgggaccta ttectcctag aagtgaaatg cataaaacoc 
8520 

ataggcaag* ttccaaagca aagattggtt tggggccttt aagagacaca gcagcaagta 
8580 

t 9999aggtg acaggtttcc Laccaatact gaaggggatt cccatatcct ccccagtecc 
8640 

ttgtcttgtt caggtatgca tgggcacgtt gaagtcggla taacttaaag cctagctggc 
8700 

attaccagac ttgcuaggca aggcttccct tggcctctgt gggttttatg acttcagtgt 
8760 

cagcaacact tcceactcct acccctggtc tcgagcataa gtctoaagag ggigggaaat 
8820 

cagcagtaac tctacctctg ctggttcagt atgaaagcct gaatgctaga tcetteattt 
8880

accent coga cctcttgatn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
8940 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9000 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9060 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9120 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9180 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9240 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9300 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9360 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9420 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
9480

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9540


nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9600


nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9660

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
9720

nnnnnnnnnn nnnnnnnnnn nnnntctgct tgactctgca gatcecaag-t eccagrtacct 
9780 

gcagaaattc tcacctccae ttcttggtat gtatgtgcaa atgagaggta taacceactc 
9840 

tcattcaaag tcccctttcc atagtagagc atgccaaaga aactgaaatc tgaartcaaa 
9900 

ageacaaaga gtgcaaggta gagctatact gaacgttatc taggggaaag attgaagggg
9960

agetctaagg tcaacaeaec aceacttcce agaaagcttc ttcatecgtt tctcrcccac 
10020 

aaagtcttat tctcaaggca gcagatacat gaatctgtcc cctrtetctt taeasctaca 
10080 

gccttggeca gccacagtga ctcatgcatg taatcecagc actttgggag gccaaggtgg 
10140 

gaggatcact tgaggtcaag attteaagac cagctgggcc aacatggtga aat.cccat.ct 
10200 

ctactaaaaa tacaaaaatt agccaggcet ggtagcatgt aggcctgtag tcccactact 
10260 

tgggaggctg agacatgaga atcgcttgas cetaggaggt ggaggttgcc gtgagcteag 
10320 

attgtgccac tccactccag actaggtgac agagcaaaac tctgtccgca gcccccaaca 
10380 

acaaaaaaaa aactacccaa actgcagtcb caeca tccct attcttgttt tctttatcct 
10440

tctctcgttt tctcggatgt tttcctttct ttttggagit ccrttatttc cacatgcgag 
10500 

tcagtaaaat tttgctctag agtttggcaa tattctgtca gcagataaac tangctcttt 
10560 

aattacataa ttggtattta tgttaaacaa gacatgaatg aaagaaaaga atataggctt 
10620 

gtattaggaa ccacttaaat ttgaatcttg ccccctcctg cattgactag ttaaatatga
10680

tcttggggaa gtcatttaat ctctccctat ctcagtttcc tcatctttga caataaggat 
10740

gagactcaca ttgctgggct gttatgagga ttaaatgaaa tacatattfct tagcactaca 
10800 

tgtaatggcc accattgcat gagtgacaga tcatgcatca tgagcctgga atgttgtaag 
10860

cattcaatga atggtatcaa ttatgtatta ataaacttta aagtcctttt aaagccac&t 
10920 

cctaatgacc agtctggcaa cagaagattg tgaagcatta gccttggtaa gtatttccac 
10980 

atagtaicat tcatagacct gggctcaagg aggaaatatc aggggacaga gtggacactc 
11040 

ttgtctcttt ccttgtgaat ttatgttcat catatagttt atggattggt ttggagtgga 
11100 

aaggaattca cttgctctgt tactagtgtg agctagggag taggttgget accCtatgta 
11160 

ttcactttca gttaacctcc acagcaacac agggaaaaag gtatttagta tcatagttca 
11220 

ttattgagaa aagtaaacct caggaagatt gagtcactta ttcagttact acataggtag 
11280

taactggtga tttcaggatt agcgtgctaa tcttataagg ctttgaaatt tattagatrtr
11340 

tgaaectgtt tctcacaata rtaaatacat ccatcccaga ggtaagcttc taaattcac: 
11400 

ttcetctatt aaattgcatt gcacattoot ecgagtacta ctttgatact ccactgttgc 
11460 

otgaetgect gfcgggtcatg gttactecac getgectgtg ttcctcatct atccttcatc 
11520 

tcatctaatt aaatggcata aggttttctg cctttt *tt t ctcaaggaaa aggactagcg 
11580

gctctagacg gacceaagtg gttccagcat cgtcgccrtac Laactcctgg attccatttt 
11640 

aacatcctga aagcatacat tgaggtgatg gctcattctg tgaaaacgat gctggtaagt 
11700 

aaagggggae agtgetctgt gcattgcgaa atgctcccag caatggacag tattaggtot 
11760 

gtgttttgtg ggccatgaaa ataaaaaatc agtttctaaa aatttaacca atgtacacgt 
11820

acttattgaa caataggtgt ctgtaaaaaa tttgttatgt tctttgagtg ataatettaa 
11880

taaaaagatc tggtcctcLg tcttagatat attttgagat tttatggcag caaaccaagt 
11940 

acoasatggt gatagttaga tagtaogtgc tgtagatgtg tttcatggag ggcgvgtc-g 
12000 

tacaaaccta occcaaagtc tgaggaaaat gagaggctga agaaaaaggo tgacagttto 
12060 

ttaaaaagaa acattcaata gaggcrttca aacaaasacc atonnrmnnn nnnnnnnnnn 
12120 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
12180 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
12240 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12300 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12360 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
12420 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12480

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12540

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
12600 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12660 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12720 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12780 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12840 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12900 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
12960 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13020 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13080

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn
13140 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13200 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13260 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13320 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13380 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13440 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13500 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13560 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13620 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13680

nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn nnnnnnnnnn 
13740 

nnnnnnnnnn nnnnnnnnnn nnnnnnnnng gtooggcttt gctgggggca gctccctgca 
13800

acagctcctc iccacacttg ctctgtttct cacttttgaa tccaaacgtt tttgaaaatg 
13860 

ttctgagrttt attttaaaat gtggcrstgg tggttgagag cagtggcagg gtacctagca 
13920 

agttcggaat cgaagttgga ggaagccctg gggtaaaccc cttgtaatta tgggtcttgt 
13980

gtcaatgatt gctttaatgg eactctggtc tgtttgaaag cagagttatg gtoetaatfcg 
14040 

aaaagccgca gatctttaac tcagccattt accatatatg cagttttctc catgctcctt 
14100

ctcactccgc tgggtgtatt tttcccttcs tcgtgecetg tgtaagcaca tggcttattt 
14160 

actcatgtga tctttggttc ctgctgggtc agggttgtet ccattagatc ataaaaacag 
14220 

ggccaggcag gagccttcaa atgaaggcaa tttggtcatg gcggtggtga tgotgttggt 
14280

cttgacctce tgtgccagga taagtgggag aagatttgca gcactcagga cacaagcgtg 
14340 

gaggtctatg agcacatcaa ctcgatgtct ctggatataa tcatgaaatg cgetttcagc 
14400 

aaggagacca actgccagac aaacaggtca gtggtgggag agcaaaaaag atotttcttc 
14460

•cattttcte agttgtttat taacacatta tcccaacttt ctcttctagc acccatgatc 
14520 

cttatgcaaa agccabattt gaactcagca aaatcatatt tcaccgcttg tacagtttgt 
14580

tgtatcacag tgacataatt ttcaaactca gccctcaggg ctaccgcttc cagaagttaa 
14640 

gccgagtgtt gaatcagtac acaggtattt gttgggtttg ggttgcccac gtccatacgc 
14700 

tgccatgatt gtactgtgtc tgtctagagg gataaacctt aatatgacaa gagaitagaat 
14760 

ctttgttatt aatggagctt ttatatagac actgctccaa agaaatttga cttgagtcct 
14820

ttataagact Ctgcttcaac catagcagta ttatcagaat ttttatatat atatatotac 
14880

actatttita ttatggacaa ttattattaa tacaaatata agtaggcact taagagttcc
14940 

egaearacat ggaatatggc tttttgcaca gcgattgcag taataataat gacaagctaa 
15000 

aaacattcot gcaaeatagg aatggagagt ggaaeagagt aa«catgg«c stgcacccga 
15060 

ongoatnttg attcaaaaac agttttagca agcataaaca caaaagttga aatagattaa 
15120 

gcrttttaag caattcaaca ttacttgtca tgaatgccat aatggagaat acttatcaag 
15180 

cagtgaatta atecttcatc agcttcacca cttactagca gttactagta agttacttac 
15240 

cgctctgttt cagtgtcatc tataaaatgg agattaaaaa agaacctatc tcatacattt 
15300 

gttgttscga tgogtgggtt aatatatata aagcatttag gacagtgcct ggcactgaat 
15360 

agatgttaaa tgtaaagtat agtlatgtca aatgtctttg cttccaggaa ttttgcaaga 
15420 

csuccaaca tatgsacact tacacataca tatatgcata catgcacata gatattataa 
15480 

agaggacact cagagaagca ggtrataaac aafcfctaaggc ataaatgggc attateaata 
15540

gcagcagttc ccaagtcttt ctgcatcatt gcacacacag aaaatgrttaa tgtttttgtg 
15600 

cttcattgga gtaaacagga atggatttgg gggaagctat acagaacttt gtaaaaaaaa 
15660 

atctttactt tttaaatatt acacaactat gatgaaaaag caaaatgcaa agtgttaggg
15720

eaaatatcaa atgttooett tattcaaaac ttaaaacctt ttcaattttt tttttttttt 
15780 

ttttttgaga tggagtctct a^cactcagg ctggagcgca gtggtg-gat cccagctcac 
15840 

tacaacctcc acctcccagg ttcaggcaat tctcctaect cagcetrctg agtagctggg 
15900

attacaggca ctgccaccac acctggctaa tttttttaaa ttgtttattt ttatttagtc 
15960 

aaatatatca atettttatt ttattgcatc tggattttta gtaatcacaa aaagccattc 
16020 

tctattccag ggtttctcaa ccctcagcac taatggcttc ttagattaga taagtccttg 
16080 

tegtcaagat gtgtgcattg taggetgttt agctacatcc ctgacatcta cceactegat 
16140 

gtagtagagc tctgatagtt atagcaacco taaataactc cagscattat tgaatgttcc 
16200 

cagggccecc agttgagaac cactgcccig tacccaggtt gtagagaaaa ttatttatgt 
16260 

tttcttgtag tacttgtata atttcattat tttcatattt aaabaagaga tctaaactcc 
16320 

atttagaatt tattcctata tatggtgtga ggtattgatc taabttttcc aaatgtttafc 
16380 

ccagttgtcc catcaccatt atttaoaagt ttatcttttc aagtgatttg agataaccat 
16440 

cacattctaa acggatacat gtactggtat ctgttttgga taagagtata tttggatgtt 
16500 

ctcgtgtatt ccattgatct atctaccaat gtaccagaat oacactgttt taattaagga 
16560 

gattttgtgg cttttttcaa cattaataga ccrtattttt agaaaagttt taggtttgea 
16620 

gaaaaatcca gcagaaagta cagagagttc tcatattacc catgtaacaa acctgtacat
16680

gtaceectgt aictauat* aaagttgaaa ttttttaaat ag:aaattaa tattacctct
16740 

gttccetatt tt.tgttttgt tttttttctc tcagctcctt c**ttataaa totattggca 
16800 

tttctttgcc tgtcttctat ttcattccat tttatttaat aacttttccg tgaagotaao 
16860

atattagact gaggaagaaa agaotaattg gtcacttgca tctaaacttg aaatcatctt
16920

aattttattg cccacatact gatggaaact atgrttttta ttrgrgttgt ttatctxtgg 
16980 

agctttaatc aaaagtccct ttgatgagaa aateaaccat ctgtgaaaat tagatctatt 
17040

toaacgtctg gaaatceggc aagetttgan ^ctottcact aaccatggct tgctttotaa
17100

tttatttgac tttgccatca ctttggtaat tggaaactat ttttctacsc agatacaata
17160

atceaggaaa gaaagaaatc cctccaggct ggggtaaagc aggazaacac tccgaagagg 
17220 

aagtaccagg attttctggo tattgtcctt tctgccaagg taaatxttct aaatttctaa 
17280

gcctgctcaa gtgaccagtt aattatgtaa gtaggtgggt aagtgggaat gggatgggga 
17340 

gacaagaata aaaccgattg actaaattta acrgtacttt gaatcgatga gcagcttcat 
17400

gcaatttgog aoaaagagag aattctgcaa ctgtgrtcgct agaggagggt tagxaaagac 
17460 

taaacgaaog atttgacaag atttgaggat tgtcatatgg atacatggat tttagggcat 
17520 

catgaaaaaa tggtcacatg gataaacgta aaaattatga tgataaggtc ctgggaaatc 
17580

tgggagtttg aagagaattt ctagggcctg ttgaLcgagg gccctttgtg caaggcctgc 
17640 

ttttcttatc taaccttggt tctcctttat gctttgggca gaat'atggtt tataccacat
17700

ntttgttgaa ctgaattaaa atttaaaccc ctatttaaag ctctgotttt tcccctcaaa 
17760 

tcattattgt ggttgtatct ccaoacatt- ataaactggc attttattta aaatatttgt 
17820 

attgtacttt ctaggatgaa agtggtagca gcttctcaga tattgatgta cactctgaag 
17880 

tgagcacatt cctgttggca ggacatgaca ccttggcagc aogcatctcc tggatccttt 
17940

actgcctggc tstgaaccct gagcatcaag agagatgccg ggaggoggtc aggggcatcc
18000

tgggggatgg gtcttctato acttggtaag atctgcaccc ctaaattttc ctgctagttt 
18060 

tccccctgag atttcgcttt attttttgcg ctggtacctt agtgacccta gtgcctcagg 
18120

atatgtgtag gtgaaacaga agaagtaggc tacttttctg ttctttctaa agagagctce 
18180 

aoattattct cttgtetttc aggaeasaaa eaaaagttta tttatccata aattgtctgt 
18240 

cattggtttt ctaatcaatg gtgtgtgaaa tgtcttattt ctttatttea ccttggctct 
18300 

gatgcattgg aaatgaggac trgetecetg ggctggcact tagaaertaa acaalagggt 
18360 

ccaagtggag ctcctcttct gagagagctg aatgattagc tgcattattt aaggctcatt 
18420

ttagacatct cccagccgct tgtcaccaat tttattcctc aggattgatt ttogacttca 
18480

jacataatat t~gatgatat atactatagt taasrt tagc aaatatggac Cgaggacact
18540 

tteaatactg egactttttt tetgectoca atttattgtg ggeetrtgrtct tcggtgaq;t 
18600 

aatggtctaa tacaggagac aggagacaga cctccaaatt. gcagtgtagc ataatgaggg 
18660 

eaatgataga gatatgtgct ggctaacaca aagacataga agacaggtac ctaccciggc 
18720

atgggagctc aaggagactt ccttgacatt tacgctgact gcaggataag taggagrttag 
18780 

ccaggtggaa actgtcatct ctatcntgct agactttaag catatactgc tgttaat^aa 
18840 

geccaggtta tgctgtttgc ooogo-ooao tgtgttcctg scat sat act ggtcaaeggg 
18900

acagaaagac agaaatgcta aggacaattc agcagcagac cagataaaaa acaccatatt
18960

tcatatgcaa aagtcaactc aattgaaaca tttgtaaaac caaatttgac ettataaaag 
19020 

tatatcagag atctcatttt ataaggaaat agaagccctt tcctaccata aactaaagat 
19080

ttaatetata togcaoaaaa tacaatgttg agtaatcatt tttaotttat tttttaactg 
19140 

acaaaaattg tgcatataca tgttatatat atatgtatgt gtgtatatat atatgatgta 
19200 

caacatgata ttttgatata tgtatacact gtggaatgac taaatctatc aatggacatg 
19260 

ttcattaact catacttatc atttttttgt ggtaaggaca tttaaaatct accctcttag 
19320 

caattttcaa gtatacaaat tgttagtaac tccaatcaca tattgtacaa tgcatctcct 
19380 

aaacttatgc ctcctgtctg actgaaattc tgtatccttt gactaacatc cctgtaatcc 
19440 

cccattctcc cacagccoct ggtaaccact gttctactcL ctgcttcttt gagtttaatg 
19500 

ttttagattt ccacatgtga gatcatgtgg aatttgtctt tctgtgcctg gcttatttca 
19560 

cttagcataa tgtcatccaa attcacctct gttgtcataa atgacaagat atttgtcttt 
19620 

tctatggcta attgrtagtc cattgtttat atatatacca tgttttcttt atceatttnt. 
19680 

ccagtgatgg acacttaagt tgetttctat atctgggcta ttgtgaataa tgctgcaatg 
19740 

aacatgggaa tgtagatgtc tcttcaatg? actgatttca tttcgtttgg ttgtatatcc 
19800

agaagtgga* ttgctgcatc atatggtagt tctattttta attttttgag gaaactccgt 
19860

acaattttcc atatggctgt actaatttac attccaacca aaagtgtata ogggttctgt 
19920 

tttctccaca tcctcaecaa catttgtctc tttggtaata accattctaa tgagcatgag 
19980

gtgatgtctc attatggttt taatttacgt ttccctgatg attagtgatg ttgogcattg 
20040 

ttttaaatac ctgctggcca ttcatgtctt ctttgtagga atgttatttt nggtttttct 
20100 

catttttaaa tctagttatt tgttttcttg cttttgaatt gtgcgagttc ctcatatatt 
20160 

ttgaatatta sccccttatc agatgtatca tttgcagaca tgttctccca tcctttaagt 
20220 

tgtctcttca ctatgttgat tgtttccttt gttgtgcaga agctttttag tttgctgcaa 
20280

aaccatttat ctattttttc ttctgttgac tatocttcca gagttgtatc caaaaeatca
20340 

ttgccaagaa taatatcaag aagcttttct ctatgttttt ttctagtagt tttatagtct 
20400 

caggtcatat gtttaaatet ttaatcsatt tttagttgat ttttgtatat ggagtgagat 
20460 

aaaggtccac ttttattett ctactagtgc atatccagtt ttctcaaeac catttattga 
20520 

agatactgcc ctttcaccac tgtatgttac tggaaccttt gtagatcagt tgacaataaa 
20580 

tgtgtgggtg tatttctgga ctcttratcc tgttttatta gtttatatgt ctcttttttt 
20640 

agaagctcta tgctgttttg gtgacrcagag ctetgtagtc aatttcagat caggtagtat 
20700 

gatgcactoc agctttgctc tttttgcrtca aaattgcttt ggctatttga gtttttttat 
20760

tccatacgoa tttlagggct ttttL'.tttt ttcgattact gtgaataatg ccattggaot 
20820

tttgatggag attgcattga atctttgggt egtatggata tttteacagt attaatgcCt 
20880

ccaattaatg ascacagggt attttgcaat ttgtgttttc ttcaattbct ttcaccaglg 
20940 

tttttttctt aatttaattg ttttatttce atagggtttg ggtaacaggt ggtgtttggt 
21000

tatgagtaag ttctttegtg gtgatttgtg agattttgat gcacccatca cctaagcagt
21060

atacactgta c=caatttgt agtcttgtat ccctcacctc cc tec caeca tttcccccaa 
21120 

gtccccaaag tccattgtat cattcttatg cctttgcatc ctcatagctt agctcccaet 
21180

tatgcgtgag aacatataat gtttggttct ccatttctga gttacttcat ttagaatatt 
21240 

ggtctccaat tccatccaga ttgctgcgaa tgcctttatt ttgttcettt tcatggctga 
21300 

gtagtattcc atagtatata catcccecaa tttctttatc cattcttgat tgatgggcat 
21360 

ttggactggt tccatgtctt tacaattgcg aattgtgctg ctaeaaacat gcaggtgcaa 
21420 

gtgtcttttt catataatga cttctcttcc tctgggtaga taccctgtag tgggattgct 
21480 

ggeteaaatq gtagttctac ttttagttct ttaaggaatc tccacactgt tttccatagt 
21540 

ggttgtacta gtttacattc ccaccaacag tgtagaagtg ttccctgttc actgtatcca 
21600 

caccatcatc tattattatt tgattttttg attatggcca ttcrtgcagg agtaaggxgg 
21660 

tattgcactg tggttttgat ttgcatttcc ctgatcatta gtgatgttga gcattttttc 
21720 

atatatttgt tggccatttg tacatcttct tttgagaatt gtctattcat gtcctttgtc 
21780 

cattttttga tgggattatt tgtttttttc ttgctaattt gagttccctg tagattctgg 
21840 

atattagacc tttgttggat gtgtaggttg tgaagatttt ctcccactct ttgggttgtc 
21900 

tgtttactct gctgattatt tcttttgctg tgcagaaact ttttagttta attaagtcsc 
21960 

ecctatttat cttttcgttg ttgttgtttt ttggggttgt tttgttttgg cttggttttg 
22020 

catctgcttt tgggttettg gtcatgaagt ctttgectaa gccaatatcx agaagggttt 
22080

ttctgotgtt ctagnatttt tatggttcog gtcttagatt taagtccttg etccatcttg
22140 

egttgatttt tgtataaggt gagagatgag gatccagttt catgcttcta catgtggctt 
22200 

gccaattatc ccagtacout ttgttgaata gggttaatat ttoauycttt atatatttag 
22260 

gtgttcctat tttgggtaca tatttattta saactatcat otcctcctga tggattgace 
22320 

cctttctcat tatataatgg tcttcttgtc tctttttaca gtttttgtct taaagcctaa 
22380 

tttgLCtgat aaaagLtcag ctaccrttgc tctcttttgg tttctatttg catggaatat 
22440 

ttttttccaa cccttcgcat tcactctatg tgtgttctta aagatgaaat gagatgctgt 
22500 

aggggcatat gcttgggtct tgttttattc attcattoag ccaccctttt gattagagaa 
22560 

tttaattcat ttgtattcaa ggtaattatt gacagacaag gacttactac tgccattttg 
22620 

ttaottgttt tcttgatgtt ttatagatct tttgttcctt tcaccctctc ttactctLtt 
22680

cctttgtgat taggtgcttt tctctagtgg tgtactttga tttttacttt ttatcttttg 
22740 

ttgctctact ataggttttt gctttgtggt taccatgagg gttacataaa gcatagttat
22800

aaaaggctat tttaaactga taacagctta actttcaaca cttaaaaaaa ctatacacrtt 
22860 

ttactctacc aactgccctc cattttatgt ctttgatgtc ataatttacc tagttttgga 
22920 

gatgtgtccc cttattgtgt atcccttaac aaattattqi agcaacagtc atttttao-o 
22980 

gttttggctt ttaactttat actagagata gaattaatta acacaccacc actacattat 
23040 

taggqxatec taaattgact atgtettta? ctttatcegt gagatttttg tttteaattt 
23100 

tcatgttgtc aattogtatt ctttcattt" aacttggaga actcacatta gcattttttg 
23160 

teagatgggt ctagtagtgg tgaacaccct caacttttgt ttatctggag atgtctttac 
23220 

ctctgcttca ttttgaaate toacttt.tgt tceatgattg aaatggacaa aattgttttt
23280

ttaattatgc aaagtgccag ggtaagcaga attactcttt trtttttttt ctgagaccga 
23340 

gtttcactct tgttgcccag gctggagtgs agtggcgcaa tctctcagct taccgcaacc 
23400 

tctgcctccc aggttcoagc gnttcttctg cctcagcctt cctgagtagc tgggattaca 
23460

ggcatgcacc accatgctcg gctaattttg catttttagt agagacgggg tttctccatg 
23520 

ttggtcaggc tggzcttgaa cacccgaccc cagatgatcc gcccacctag gcctcccaaa 
23580

gtgctgggat tgcaggtgtg agccactgcg cctggccaga attactctra tttatcctga 
23640 

gcttgaggaa gaaagaattc aaaattaaaa tttcacatta cctaatggac aaagcctgca 
23700 

ttcaaaataa gtaatcagaa aaacatataa aaacacaata agataaacag actaaatata
23760

tgcagtcatt fctatggaacc aatctgacta gattggatgc agactaggta ggacgcaaat 
23820 

ttaaaaaaaa cttcattctt cttccactta taaactttaa acctgctttg tggageaagt 
23880

tctttttetc tctggggaaa gatcctoagt aagtctcatn gagttctcat tcatttaaat
23940 

cocanganc* atct taggtc agtaottaaa ctatctggcc cagtgtaato ctgaaactct 
24000 

caaatactta tccacttgag etcttctttc catcccagct tgg^acttct ttggtcctag 
24060

aagccagcag tggtttatca tcgacttatt cttactgact agctceccaa tacccagtag 
24120 

ctgctgtttc tggcccctcc aggaatggtt ttaggaggaa aggggataag gagtaaaggg 
24180 

ctggcactat tgtgatcatg eeaaagggct tggtggatat tccatgcttc cctttctctc 
24240 

aagaggaaac tceetttett ggagactctc tcsctagaac tttccagagg tgattcaggg 
24300 

gacaagagaa taattgtcct teggcagact ctttttcaag ctggtcccag agclttccct 
24360 

cttcccagtt aattggttta aggacacagt tgcacatcct tgcettgcct ctgctgctgt 
24420 

ectctgcetfc tctgtctgtt ctgagttata gcctttcaca tcagtcctgt aetccccaaa 
24480

ctccaaggag oacaagtcag atcatctaag tgatcctctt gaagcctctt gtttaagatg 
24540 

ggggaagcac ccttcctttt ccatggcact ctggcattcc aacaacaett taaataattt 
24600

Lttctctcaa aatlcttaag cctctoctct ttaatccttc gccattttta tgtattatta 
24660 

ctttatatga tgagctaaga gttacaaaac tggtttttag aaatctcctt agcaaatgtt 
24720 

ttactgctag tttagcagct cactttataa taaggatata tgatatattt ctttggttcc 
24780

tctgcctctg ggacctcagc tcatcctgag gcagagagtc ccottttaac attctgttac 
24840 

ataaaccagt ggcaaaatgg ctttaacctg agggtaataa ttaccaggaa caaacagaaa 
24900

acagaaaaea agtaaactgg ttetgatatc tgagtccctt ccctccctca tcctcacagg 
24960

gaccagctgg gtgagatgtc gtacaccaca atgtgcatca aggagacgtg ccgattgatt 
25020 

cctgcagtcc cgtccatctc cagagatctc agcaagccac ttaccttccc agatggatgc 
25080 

ecattgectg eaggtcttta cattctttts ctaagcagtt cttagaggct atgggatcct 
25140 

ggagoccaca gtgacaasga ttogtgagtc tcttagcoct tggagaegtc aaaagataat 
25200 

gctaacatgt gacttaggtt ttatcaccta tgaggagcte agaggataat getttggtca 
25260 

gacatgaatt tcaatgactt tcccaaagg= acatagccag ttgcagcaaa gctaagccca 
25320 

gaatccatgt ctctggaatc ccagcccagg gtctcttcca ttgtgggaca tcatttctea 
25380 

gataatcttt gtttggctga gtttgagacc gagctgaaac ttcatggaaa atagcaccag 
25440 

catctttatc tgaaagacca agggggatct ttggcctcat cat oat data tcacccttat 
25500 

aaataLacaa cattlaatag tLaatataga gccltcagac ccattecctc atttttcccc 
25560 

ttggoatcca atgttaacag atgcttater aatgatttac agttcactgo acacttttaa 
25620 

gtactttcaa tgtggcccaa aatccagagg cagccccaat gtgtagatga cattaactga 
25680

tgtgagcog* gctagaaert gtgcggagac cctgagtctg gagcctagag ttettcggaa
25740 

ceacacaggt ttctgagcag ggcttetagg oogcagaggg gtcatgtgeg acatattatc 
25800

tgattcaatg ttctattaat tcatgtctta ggaagcaagc caacaggatt gcttctggca 
25860

aaeacctaca gcctgttact gtaactttgc tgacagaccc agaattaatt tctggaagst 
25920 

aqsottattt ctggaaacca aacaaccctc acattctctx tcctttgttt tgcactctgt 
25980 

ttctccccaa aecacatgga tatttgccaa aettctccac tttccatatg tgaatagcac 
26040 

eaatggasBt ttgtcatggg atctgcatga cogoatcoco gttctgtgtg tgtgtgtgtg 
26100 

cgttttcctc tcaagacaga gtcttgetat gtagcccagg crggagtaca gtggcgtaat 
26160 

ctcggctcac tgcaacctcx gcctcccagg tttaagcagt tctcctgsct cagcctccsg 
26220 

agtagctggg attacaggtg cacaccacgc ctggcaaatt tttgtatttt tattagagat 
26280

ggggtttcoc catgttggcc aggctagtct caagctcctg atcccgagac cagccctcst 
26340 

cagcctccca aagcgctggg actacagcca tgagccactg cacccagcca gttctgtgct 
26400 

tttataccta aattgtctcc aggagtgctt aatagtccat taataggtat ttoggccagg 
26460

cacagtggct gacgcatata atcccaatat tttgtgacac caaggtggga agactgcttg 
26520 

aagttaggag tctgagacta gcctgggcaa catagggaga eectgtettt acaaaaaaaa 
26580

aaaagagaga gatagccagg catggtgttg cjtgcttgta ttcctgccta cttgggggac 
26640 

tgaggcagga ggatcacttg agctcagaag ttcaaggtta ccgtgagcas tgttcacgsc 
26700 

actgctctcc agcetgattg acaggccaga ccctgactct aaacaaaaac aaaaaacaaa
26760

tatttaagta atttceaaac atagcagaaa atataagcat ggtctatcac tttgatatga 
26820 

caccaacagc eacttaagat agagtcatga attcagtaaa ttgttgtgtg gaaagctaeg 
26880 

gtgccaaccc aagccgcatc ttcttaggtg ctcctcactg gtgtcntoog ctacagcagg 
26940 

cagagcattg ccaggagcta gctcttccct tcaagaacaa aagccttgtt taagagcaca 
27000 

gtagcccaca acttgetctt tcccctgcag tctcttttat ttccctcctt tcttagggat 
27060 

caccgtggtt cttagtattt ggggtcttca ccacaaccct gctgtctgga aaaacccaaa 
27120 

ggtetgattc tctcttgtac ataaatactt ccaegaacta atgctgtgca agtcactttt 
27180 

tggtagctaa gcacagaagt ggctatatea ttaoqggaeo tgacecoaat taaacaaaaa 
27240 

taaacataaa agccaaaaga aatgtaaaac tattctatgt tcttgeaaca ctcttgacgt 
27300 

gtatcagtga tttccttcat gtaagccaet aaggtttaag ateeattact fcgtaacagga 
27360 

•gctggagta tatgtctctg taataattgg ccacatcatc attttgactt gatttctang 
27420 

tggatgcaca tccatttcta agtggatgta tctccatagt gaaaataata ccacttgcsa 
27480

tagtatttct gtttgcetgg gtatcagaea aatcagctgt gaagctgcaa ggtctgcagg
27540 

tctgonggfca cactgcccag tgtagtagcc acgggccaco tacggctcct gagcacatga 
27600

catgtggcca gttggaattg etjttgt gctg taagtttaaa atacgtgctg gattttgaag 
27660 

acatagtacc ctaaaaaaat gtgaaacatt tccttttagt aattatttat attgattasa 
27720

ggttggaatg gtaatttttg gttaaataaa ctctattaag attaacttca cctcttoaoa 
27780 

atgtgaccac cagaacattr teoattaca= atgtagatca cettotattt ctattgatsg 
27840

gtgctaggtg gtaggtgaag aaatgtgttc atgrttgtttg ggggetggtg ttggggttgt 
27900

ectctcattt caggtctttg accccttgag gttctetcag gagaattctg atcagagaca 
27960 

cccctatgcc tacttoccat tctcagctgg atcaaggtga gaacaatttg aagtcgctga 
28020

aagtacccaa agatgtttae ttgagagtag tttottcctt tcagctcTtc agctctatac 
28080

attcttccag ggaaccgtag otcttggtgz ctatttgagc ccoaaaggat cagttagttt 
28140

tacaaaygac aatcgtattc tctgtcacat ccctttcgyc catgcctcaa aagcagtccc
28200 

acaotgtaag ctactgctca taggctcaat gcagtccacc ttcaaagcaa gagaaataat 
28260 

ttcatgagta actccaactg cogccttgtt atagggaagg catcetgttg gagectccsa 
28320

gctcaaattc tcacagtgaa caatttaagt ctaaagttoa aaagtttcaa tggcatttgg 
28380

tggaaaaaat atcactttac tgtgtacttc agacttcttg tactagtatt ttactatagt 
28440

cagaagaaac atcatttttt caagtatcac tttctttccc tcttgtcttc aggaactgca 
28500 

ttgggcagga gtttgccatg attgagttaa aggtaaccat tgccttgatt ctgctccaet 
28560 

tcagagtgac cccagacccc accaggcctc ttactttccc caaccatttt atcctcaagc 
28620 

ccaagaatgg gatgtatttg cacctgaaga aactctccga atgttagatc tcagggtaca 
28680

atgattaeac grtactttgtt tttcgaagtt aaatttacag ctaatgatcc aagcagatag 
28740 

aaegggatea atgtatggtg ggaggattgg aggttggtgg gataggggtc tctgtgaaga 
28800 

gatccaaaat catttctagg racacagtgt gtcagctaga tctgtttcta tataactttg 
28860

ggagattttc agatcttttc tgttaaactt tcactactat taacgctgta tacaccaata 
28920 

gactttcata tattttctgt tgtttttaaa atagttttca gaattatgca agtaataagt 
28980 

gcatgtatgc tcactgtcaa aaatteccae cactagaaaa tcacgtagaa taaaaatttt 
29040 

aaatctcact tcacttagcc gacattccat gccctgacca atcctactgc ttttcctaaa 
29100 

aacagaataa tttggtgtgc attctfctcag actttttcct atacatttta tatgtagaaa 
29160 

tgtagcaatg tatttgtata gatgtgatca ttcctatatt gttattgatt tttttcactt 
29220 

■ataaaaatt caccttattc cttotcattg ctttatggta ttctgtaata tgaatgtact 
29280

ataatttatt toactatttt ccttattggg catttaagtt atttctagtt ttaaaaacat
29340 

gcttgtcaat ggcaacaaaa gccaaaattg acaaatggga tctaaetaaa ctaaagagct 
29400 

tctgcacagc aaoacaaact accatcacac tgaatgggca gcctacagea tgggagaaaa 
29460

tttttgcaac ctactcatct gacaaaggcc taatatccag aatctacaat gaactcaeac 
29520 

naatgtacaa gaaaaaaaca accccatcaa aaagtgggtg aaggatatga acagacactt 
29580

ctcaaaagaa gacatttacg cagccaaaag acacatgaaa aaatgcctat cgtcactggc 
29640 

catcagagaa atgcaaatca aaaccacaat gagataccat ctcacaccag ttagaatggc 
29700

aatcattaaa aagtcaggaa acaacaggtg ctggagagga tgtggagaaa taggaagacr 
29760

tttacactgt tggtggcagg agaatcactt gaacccggga gggggaggtt gcagtgagcc 
29820 

gaggtggcgc cactgcactc cagcctgggc gacagaacga gtactccatc tcaaaaaaaa 
29880 

aaaaaaagga c&ccaaactt ctcaotctta atgttgtcat ctaigtggta tcttccataa 
29940 

tctctctcag acagagtcat cttttgctga tatgatctta cagtattttt tgtttabacc 
30000 

attataatct cattaattgc agcaacacaa atgacaaaag a=aactgatt tctccccttg 
30060 

gatgacctaa tttgctctca ctcttccatc atcocttata acatgstgat tcxcaaattc 
30120 

atctacctaa aatctotata taaaaaantc cctcccttge attccagatc cttggegaca 
30180 

aacacccacg tstaaaacca aaCttgttta acactggacc agtcgtcctg tgtgactttc 
30240 

c«ttttgtco ctattttgtc agctggtata ccaatatcca cecagttaaa caatatttcc 
30300 

ttgttttttt ctggtacaaa cccaaataaa ttacaaacat caataaaagt aaaattcta*
30360

aataactcac tttctctata tatctccttc ttgctggaaa aatgggttag gttagttett 
30420 

taaaagcatg eatgataaat tgtactgaat acaatattca ggtccggaca tactaggtat 
30480 

aattttctgt gtctctgggg tcttacctat ttggggtcaa aataaacaag tttattaagc 
30540

tcattaatat Icaatctcat tatcttcttt aacaattatg ttccctggta gtttcattgc 
30600 

caataattta tttgtcaggt Lgccaggtgo ttctaaactt ctgtgtattt tttcatatcc 
30660 

aattttactt taaatatttt tagaaaagag gtctgttaaa tttcctaata attattatat 
30720 

tattgttttt tcactgacat tttgtgaatt gaaaaccctt aaaaatatga aatcattttt 
30780 

tcgaeatotg tgccacagac aattttgtta aataagaaga cagaaacagg gcettatcaa 
30840 

gagataaata ttcaatatac cttatatttc tgtcacacat ttttatacca actgtgccaa 
30900 

aaattgtata tcatataaat gataacaagt tcacaaaggc attcctttat cccttaactc 
30960 

Ccaaattaga a act t: teat a ggtaggaagt aggggaagca tatattccct ttgaaaggtg 
31020 

caagaaaatg tcattggcat tcaccatggt actcttcaag cttaaaaaaa atggactgca 
31080

*a*cattt*c aaacatagca tatttattgg gtacct f.at gtttacataa atattgaaga
31140 

tatctcacet acctctttca atcagattat ctcactgaca tttattgacc actttctatg
31200

gggaaaac
31208